Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: add releaseLoadedModule #50618

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions lib/internal/modules/esm/loader.js
Original file line number Diff line number Diff line change
Expand Up @@ -223,6 +223,25 @@ class ModuleLoader {
};
}

/**
* Evict saved result of `resolve` and `load` for the given import parameters.
*/
evict(specifier, parentURL, importAttributes = {}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
evict(specifier, parentURL, importAttributes = {}) {
evict(specifier, parentURL, importAttributes = kEmptyObject) {

let resolved;
try {
resolved = this.resolveSync(specifier, parentURL, importAttributes);
} catch {
return false;
}
const requestKey = this.#resolveCache.serializeKey(specifier, importAttributes);
let didEvict = this.#resolveCache.delete(requestKey, parentURL);
if (this.loadCache.delete(resolved.url, importAttributes.type)) {
// nb: Careful with short-circuits here, we want to run both deletes unconditionally.
didEvict = true;
}
return didEvict;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we perhaps evict from the load cache first? I know these calls are sync, but the load hook is not; if evict is called while a module is being loaded (so after it’s resolved but before load has finished and it’s been added to the load cache) then a module could end up deleted from the resolve cache but not from the load cache. Whereas if you delete from the load cache first, and only on success of that do you continue on to delete from the resolve cache, we avoid leaving the caches in a “broken” state where the resolution stays cached by the load got uncached.

This raises a broader question: do we need to delete from the resolve cache? What are the use cases for this API, and could they be fulfilled by deleting only from the load cache? As in, when would we need to redo the resolution? Perhaps after new module customization hooks were registered, that would affect resolution?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting the resolution is nice because otherwise updates to package.json cannot be hot loaded and would force an app restart. But deleting the resolve cache in the manner in my diff is definitely a half measure. I think a more reliable feature would need to record all the resolutions to a given loaded module, so that when the loaded module is evicted all resolutions to it are deleted as well. The abstract specification method HostLoadImportedModule is decomposed into our resolve and load stages-- each with their own cache. To actually clear that abstract method requires more machinery which I'm not sure is worth it.

}

/**
* Get a (possibly still pending) module job from the cache,
* or create one and return its Promise.
Expand Down Expand Up @@ -618,8 +637,17 @@ function register(specifier, parentURL = undefined, options) {
);
}

/**
* Release saved results of `resolve` and `load` for the given import parameters.
*/
function releaseLoadedModule(specifier, parentURL, importAssertions) {
const moduleLoader = require('internal/process/esm_loader').esmLoader;
return moduleLoader.evict(specifier, parentURL, importAssertions);
}

module.exports = {
createModuleLoader,
getHooksProxy,
register,
releaseLoadedModule,
};
22 changes: 22 additions & 0 deletions lib/internal/modules/esm/module_map.js
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,17 @@ class ResolveCache extends SafeMap {
return internalCache;
}

/**
* @param {string} serializedKey
* @param {string} parentURL
* @returns {boolean}
*/
delete(serializedKey, parentURL) {
const has = this.has(serializedKey, parentURL);
delete this.#getModuleCachedImports(parentURL)[serializedKey];
return has;
}

/**
* @param {string} serializedKey
* @param {string} parentURL
Expand Down Expand Up @@ -88,6 +99,17 @@ class ResolveCache extends SafeMap {
*/
class LoadCache extends SafeMap {
constructor(i) { super(i); } // eslint-disable-line no-useless-constructor
delete(url, type = kImplicitAssertType) {
validateString(url, 'url');
validateString(type, 'type');
const cachedJobsForUrl = super.get(url);
if (cachedJobsForUrl) {
const has = type in cachedJobsForUrl;
delete cachedJobsForUrl[type];
return has;
}
return false;
}
get(url, type = kImplicitAssertType) {
validateString(url, 'url');
validateString(type, 'type');
Expand Down
3 changes: 2 additions & 1 deletion lib/module.js
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,11 @@

const { findSourceMap } = require('internal/source_map/source_map_cache');
const { Module } = require('internal/modules/cjs/loader');
const { register } = require('internal/modules/esm/loader');
const { releaseLoadedModule, register } = require('internal/modules/esm/loader');
const { SourceMap } = require('internal/source_map/source_map');

Module.findSourceMap = findSourceMap;
Module.register = register;
Module.releaseLoadedModule = releaseLoadedModule;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cannot be available to the main thread as its effects are forbidden by ECMAScript spec. The best we can do is to have it available only in the loader thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain? Is this just a way of saying that hooks need to be registered before users are allowed to use this? Why?

How would this work from the other thread since the caches are on the main thread? Cross thread communication?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, the relevant specification is https://tc39.es/ecma262/#sec-HostLoadImportedModule:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we're aware that this API explicitly violates spec if used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that the spec gives some guarantees that await import("a") will always return the same value from the same module (as long as the module resolution has succeeded at least once)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, as an alternative, we could choose an approach where an explicitly opt-in command line flag is required to enable this kind of non-standard ESM behaviors.

node --esm-module-ext

This makes it absolutely clear that the individual running the node.js process is deciding on the non-standard behavior as opposed to some arbitrary library they are depending on.

Copy link
Member

@GeoffreyBooth GeoffreyBooth Nov 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it absolutely clear that the individual running the node.js process is deciding on the non-standard behavior as opposed to some arbitrary library they are depending on.

I don’t see the need for this. You might as well call the flag --allow-me-to-violate-spec. Why should a user need to opt into this via flag when they’re already opting into it by using the API, either explicitly or via a dependency that they knowingly added? If I were the author of Vite, I would be annoyed; I would need to choose between the following:

  1. Leak memory.
  2. Tell users to always run Vite with --allow-me-to-violate-spec with an explanation that doing so prevents a memory leak. (Sounds like a bad spec, doesn’t it?)
  3. Run Node in a child process with this flag set, so that it’s hidden from the user.

The third option seems like the obvious choice, and now we’re back to the user not having explicitly opted into this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps that decision should be revisited, then - otherwise an onslaught of security vulnerabilities is likely to be forthcoming where random packages use the loader api, and this one, to do literally anything they want to an application.

The built-in node:* module namespace is chocked full of security vectors and npm is already a huge hazard. Tinkering with module resolution is hardly a major concern given that node:fs and node:child_process exist. The real risk is ecosystem fragmentation, like what we have with CommonJS and ESM. This is comparable to delete require.cache[key] under CommonJS which wasn't even widely used in that ecosystem.

HMR is already possible; that’s what Vite is doing

Vite's HMR implementation is a client (browser)-side feature. For SSR you have to use await vite.ssrLoadModule('/main.js') which operates outside of nodejs's module graph as I understand it. I might be wrong here because I personally don't use Vite.

dynohot implements import.meta.hot on the server-side as a nodejs loader. To my knowledge it's the only implementation that truly operates under nodejs's module graph. I implemented the es262 linking algorithm to specification and use some code transformations to tie it all together. I'd love to see such a feature in nodejs but without support from v8 you need a code transformation, otherwise there's no way to update the linking of an already resolved and imported name. It seems like code transpilation is something that is outside the scope of node's runtime (and thus we have the loader API).

Copy link
Member

@jasnell jasnell Nov 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should a user need to opt into this via flag when they’re already opting into it by using the API, either explicitly or via a dependency that they knowingly added?

Based on experience, I would wager very few users know what most of their dependencies are doing.

More importantly, I'm not saying no to this, I'm saying I'd like us to make sure we're we're being careful to not introduce new headaches we might regret later. I guess what I'm missing is a higher level description of the use cases, how these various bits fit in with that, and a example that ties it all together. Has such a thing been written up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#49442 has pieces of that but yes, a better writeup would be good. One of my notes on this PR is that it needs documentation, so that seems like a good place to start. @laverdet Could you please make the example written for the docs one that shows the most common use case?

Module.SourceMap = SourceMap;
module.exports = Module;
17 changes: 17 additions & 0 deletions test/es-module/test-esm-evict.mjs
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
import '../common/index.mjs';

Check failure on line 1 in test/es-module/test-esm-evict.mjs

View workflow job for this annotation

GitHub Actions / test-asan

--- stderr --- node:assert:172 throw err; ^ AssertionError [ERR_ASSERTION]: Unexpected global(s) found: value at process.<anonymous> (/home/runner/work/node/node/test/common/index.js:403:14) at process.emit (node:events:519:28) { generatedMessage: false, code: 'ERR_ASSERTION', actual: undefined, expected: undefined, operator: 'fail' } Node.js v22.0.0-pre Command: out/Release/node --test-reporter=spec --test-reporter-destination=stdout --test-reporter=./tools/github_reporter/index.js --test-reporter-destination=stdout /home/runner/work/node/node/test/es-module/test-esm-evict.mjs

Check failure on line 1 in test/es-module/test-esm-evict.mjs

View workflow job for this annotation

GitHub Actions / test-linux

--- stderr --- node:assert:172 throw err; ^ AssertionError [ERR_ASSERTION]: Unexpected global(s) found: value at process.<anonymous> (/home/runner/work/node/node/test/common/index.js:403:14) at process.emit (node:events:519:28) { generatedMessage: false, code: 'ERR_ASSERTION', actual: undefined, expected: undefined, operator: 'fail' } Node.js v22.0.0-pre Command: out/Release/node --test-reporter=spec --test-reporter-destination=stdout --test-reporter=./tools/github_reporter/index.js --test-reporter-destination=stdout /home/runner/work/node/node/test/es-module/test-esm-evict.mjs

Check failure on line 1 in test/es-module/test-esm-evict.mjs

View workflow job for this annotation

GitHub Actions / test-macOS

--- stderr --- node:assert:172 throw err; ^ AssertionError [ERR_ASSERTION]: Unexpected global(s) found: value at process.<anonymous> (/Users/runner/work/node/node/test/common/index.js:403:14) at process.emit (node:events:519:28) { generatedMessage: false, code: 'ERR_ASSERTION', actual: undefined, expected: undefined, operator: 'fail' } Node.js v22.0.0-pre Command: out/Release/node --test-reporter=spec --test-reporter-destination=stdout --test-reporter=./tools/github_reporter/index.js --test-reporter-destination=stdout /Users/runner/work/node/node/test/es-module/test-esm-evict.mjs
import { strictEqual } from 'node:assert';
import { releaseLoadedModule } from 'node:module';

const specifier = 'data:application/javascript,export default globalThis.value;';

globalThis.value = 1;
const instance1 = await import(specifier);
strictEqual(instance1.default, 1);
globalThis.value = 2;
const instance2 = await import(specifier);
strictEqual(instance2.default, 1);

strictEqual(releaseLoadedModule(specifier, import.meta.url), true);
strictEqual(releaseLoadedModule(specifier, import.meta.url), false);
const instance3 = await import(specifier);
strictEqual(instance3.default, 2);
Loading