Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalidate cache when using import #49442

Open
jonathantneal opened this issue Apr 5, 2019 · 144 comments
Open

Invalidate cache when using import #49442

jonathantneal opened this issue Apr 5, 2019 · 144 comments
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. feature request Issues that request new features to be added to Node.js.

Comments

@jonathantneal
Copy link

How do I invalidate the cache of import?

I have a function that installs missing modules when an import fails, but the import statement seems to preserve the failure while the script is still running.

import('some-module').catch(
  // this catch will only be reached the first time the script is run because resolveMissingModule will successfully install the module
  () => resolveMissingModule('some-module').then(
    // again, this will only be reached once, but it will fail, because the import seems to have cached the previous failure
    () => import('some-module')
  )
)

The only information I found in regards to import caching was this documentation, which does not tell me where the “separate cache” used by import can be found.

No require.cache

require.cache is not used by import. It has a separate cache.
https://nodejs.org/api/esm.html#esm_no_require_cache

@devsnek
Copy link
Member

devsnek commented Apr 5, 2019

the import cache is purposely unexposed. adding a query has been the generally accepted ecosystem practice to re-import something.

however, a failure to import something will not fill the cache.

this trivial program works fine for me (assuming nope.mjs does not exist):

import fs from 'fs';

import('./nope.mjs')
  .catch(() => fs.writeFileSync('./nope.mjs'))
  .then(() => import('./nope.mjs'))
  .then(console.log);

@jonathantneal
Copy link
Author

@devsnek, hmm, might this be limited to imports that use node_modules? This similarly trivial program fails for me the first time, but not the second.

import child_process from 'child_process';

import('color-names')
  .catch(() => child_process.execSync('npm install --no-save color-names'))
  .then(() => import('color-names'))
  .then(console.log);

@bmeck
Copy link
Member

bmeck commented Apr 5, 2019 via email

@devsnek
Copy link
Member

devsnek commented Apr 5, 2019

if its just happening with node_modules it could be #26926

@MylesBorins
Copy link
Contributor

can this be closd?

@jkrems
Copy link
Contributor

jkrems commented Aug 29, 2019

I think a use case like this would hopefully be implemented as a loader. Do we already track this as a use case in that context?

@bmeck
Copy link
Member

bmeck commented Aug 30, 2019

@jkrems we have old documents with that as a feature, but no success criteria examples.

@giltayar
Copy link
Contributor

FYI, I'm implementing ESM support in Mocha (mochajs/mocha#4038), and cannot currently implement "watch mode", whereby Mocha watches the test files, and reruns them when they change. So "watch mode" in Mocha, in the first iteration, will probably not support ESM, which is a bummer.

While we could use cache busting query parameters, that would mean that we are always increasing memory usage, and old and never-to-be-used versions of the file will continue staying in memory due to the cache holding on to them.

And I'm not sure a loader would help here, as the loader also has no access to the cache.

@guybedford
Copy link
Contributor

guybedford commented Nov 26, 2019 via email

@devsnek
Copy link
Member

devsnek commented Nov 26, 2019

I'm really not a fan of the idea of our module cache being anything except insert-only. CJS cache modification is bad already, and CJS modules don't even form graphs.

Additionally, other runtimes (like browsers) will never expose this functionality, so some alternative system will have to be used for them regardless of what node does, in which case it seems like that system could just be used for node.

@bmeck
Copy link
Member

bmeck commented Nov 26, 2019

@giltayar have you looked into using Workers or other solutions to have a module cache that you can destroy (such as by killing the Worker)?

@giltayar
Copy link
Contributor

@bmeck - interesting. That would mean that the tests themselves run in Workers. While I am theoretically familiar with workers, I haven't yet had any experience with them: is any code that runs in the main process compatible with worker inside a worker? In other words, compatibility-wise, would all test code that works today in the "main process" work inside workers?

I wouldn't want Mocha to have a version (even a semver-major breaking one) where developers will need to tweak their code because now it's running inside a worker. I'm guessing that there's a vast amount of that code running inside Mocha, and any incompatibility would be a deal breaker.

@devsnek
Copy link
Member

devsnek commented Nov 28, 2019

there are differences between workers and the main thread, mostly surrounding the functions on process, like process.exit() in a worker doesn't end the process, just the thread. There's a good list here: https://nodejs.org/api/worker_threads.html#worker_threads_class_worker

@giltayar
Copy link
Contributor

Looking at the list, I can see process.chdir() is not available, which is probably a deal breaker in many tests (unit tests probably don't use process.chdir(), but Mocha is used for all sorts of tests), as is breaking some native add-ons (although I'm not sure how big of a problem this is in the real world).

I would hesitate to say this, as my only contribution to Mocha currently is this pull request, but I would guess that the owners would veto this. Or maybe allow this only if we add a --run-in-workers option. In any case, without looking too much at the code, this is probably a significant investment to implement for supporting ES Modules, as this is not a simple refactor, but rather an architectural change in how Mocha works.

@giltayar
Copy link
Contributor

If it wasn't apparent from the above, I believe I would still prefer a "module unloading" API, unless the working group is adamant and official about not having one, of course. Which would probably mean going the "subprocess"/"worker" route.

@devsnek
Copy link
Member

devsnek commented Nov 28, 2019

i admittedly don't know much about mocha... is using a separate process not doable either?

@giltayar
Copy link
Contributor

I'll go back to the Mocha contributors team with this.

@boneskull
Copy link
Contributor

Hi, I work on Mocha!! I am trying to see how we can move @giltayar's PR forward.

There are actually two situations in which "module unloading" is needed in Mocha:

  1. In "watch" mode with CJS scripts, when Mocha detects a file must be reloaded, it is deleted from require.cache and re-required, then tests are re-run. Mocha is not the only tool that does this sort of cache-busting.
  2. When developers are writing tests with Mocha (and many other test frameworks), they may want to use module-level mocking--they essentially replace one module with another phony one (I'm going to tag @theKashey here because he knows more about this ). Or even pretend like a module does not exist at all. It is then very important to Mocha that users can consume these sort of mocking frameworks to write their test code.

In the first case, it's possible, though probably at a performance cost, Mocha could leverage workers to handle ESM. I don't know enough about workers to say whether this will provide a sufficient environment for the test cases, but it feels like a misuse of the workers feature. At minimum it seems like a lot of added complexity.

In the second case, I can't see how using workers would be feasible. Test authors need to be able to mock modules on-the-fly and reference them directly from test cases, using mocking frameworks.


I don't know why this sort of behavior was omitted from the official specification. If the reasons involve "browser security", well, it further reinforces that browsers are a hostile environment for testing. I do know that this behavior is a very real need for many, from library and tooling authors down to developers working on production code.

We do need an "unload module" API; until such a thing lands, tools will be limited, implementations will be difficult (if possible), and end users will be frustrated when their tests written in ESM don't work. I will also be frustrated, because those frustrated users will complain in Mocha's issue tracker!

I'm happy to talk in further detail about use-cases, but I'm eager to put an eventual API description in the more-capable hands of people like @guybedford.

@devsnek Given that enabling it also enables tooling, I'm curious why you feel locking this sort of thing down is a better direction?

cc @nodejs/tooling

P.S. I will be at the collab summit, and the tooling group will be hosting a collaboration session; maybe this can be a topic of discussion, or vice-versa if there's a modules group meeting...?

@theKashey
Copy link

Do you need unloading API for the watch mode? Yes, you need it to update the changed module code.

However, it is enough to handle watch mode? No, as long as the idea is to use changed module, as you have to find parents between you(a test) and changed module, and wipe them to perform a proper reinitialization.

So - an ability to invalidate a cache line is not enough, for the mocking task we also have to know the cache graph, we could traverse and understand which work should be done.

An API for unloading modules certainly makes sense.

In my opinion - this feature is something missing for a proper code splitting. There are already 100Mb bundles, separated into hundreds of pieces, you will never load simultaneously. But you if will - there is no way to unload them. Eventually, the Page or an Application would just crash.

@giltayar
Copy link
Contributor

giltayar commented Dec 4, 2019

@boneskull - the second case you mentioned, I believe can and should be handled by module "loaders", which are a formal way to do "require hooks" for ESM. These will enable testing frameworks (like sinon and others) to manipulate how ES modules are loaded, and, for example, exchange other modules for theirs.

The spec and implementation for that are actively being discussed and worked on by the modules working group (see nodejs/modules#351).

@jonerer
Copy link

jonerer commented Feb 19, 2020

I also need this. I'm making a template rendering engine. When generating the compiled template, I read from a custom format and output to a .js file (a standard ES Module). In order to use the file, I just import it. Upon file changes, I would like to re-write the file, clear the import cache and then re-import it.

@devsnek
Copy link
Member

devsnek commented Feb 19, 2020

These all sound like use cases for V8's LiveEdit debug api (https://chromedevtools.github.io/devtools-protocol/v8/Debugger#method-setScriptSource). You can call into it using https://nodejs.org/api/inspector.html. cc @giltayar @boneskull

@georges-gomes
Copy link

+1 for unloading ES Modules.
It's hard to make Hot Module Reload otherwise. Not for production but for development tools.
And using a ?query=x doesn't seem to work on file node 13.11.0 at least.
Thanks

@georges-gomes
Copy link

@devsnek Can you provide a little example or pseudo-code on usage of setScriptSource. I have been researching for an 1hour without progress. Thanks

@georges-gomes
Copy link

@devsnek ok I progressed, I will post my findings back

@devsnek
Copy link
Member

devsnek commented Mar 20, 2020

@georges-gomes you can subscribe to the Debugger.scriptParsed event to track the script id, and then when you need to modify the script you can call Debugger.setScriptSource.

@jonerer
Copy link

jonerer commented Mar 20, 2020

@georges-gomes If you are successful, I would be very grateful if you could post a short description on how you could use setScriptSource to solve this problem. On a blog post or something.

@laverdet
Copy link
Contributor

laverdet commented Nov 3, 2023

that’s just like clearing the cache in a browser

It's not a good comparison, since the browser cache doesn't affect correctness. I mentioned that here, that I think "cache" is a poor choice of name for this since it is a matter of correctness and not performance.

I think the only consequence would be that a subsequent import might produce a different result rather than getting the previously cached result, but that’s also similar to a browser cache having been cleared.

Clearing the browser cache of a running web page will not affect the result of import(). It has no observable effect, except on timing.

I know the spec says that subsequent imports must return the same result, but I assume that that means “in general usage,” not if the user has specifically instructed the runtime to not return a cached result.

The language in the specification is clear, and our proposed function is a violation of the invariants expected of the host. As a matter of personal style I am totally ok violating any rule for any reason, but I do want to make sure we're on the same page that this is a dangerous and powerful violation of the specification.

@GeoffreyBooth
Copy link
Member

I do want to make sure we're on the same page that this is a dangerous and powerful violation of the specification.

I mean, the while point of the module customization hooks is to allow the user to violate spec however they please (that Node can achieve). It's not like importing CoffeeScript is spec compliant. If we're worried about dependencies using this API to mischievous ends we could gate it behind a flag or create a permission for it; but I'm not sure what the risk is.

@laverdet
Copy link
Contributor

laverdet commented Nov 4, 2023

It's not like importing CoffeeScript is spec compliant.

It is, though. CoffeeScript would be a Cyclic Module Record (a module which participates in the specified cyclic resolution algorithm). Cyclic Module Record is described in the specification as "abstract".

Source Text Module Record, which is the ES Modules that we all know, is a concrete implementation of that abstract interface. So a CoffeeScript module would be a separate but valid implementation of a Cyclic Module Record.

If you go up one level there is a plain Module Record, from which Cyclic Module Record is implemented. This provides the means for modules which don't participate in the cyclic resolution algorithm (wasm, json, or the whole node: scheme). Neither CoffeeScript, WASM, or TypeScript are specified in ES262 but they are still spec compliant.

Anyway, you are right that the loaders API provides the means to break specification, since the result of resolve is neither pure nor is it memoized. This breaks the invariants of HostLoadImportedModule: "If this operation is called multiple times with the same (referrer, specifier) pair [then it must resolve] with the same result each time". I suppose the difference here is that we are proposing a globally-importable utility which could be used outside of a loader.

but I'm not sure what the risk is

The risks are impossible to enumerate since we're reneging on an presumed invariant. Like what happens if a user attempts to evict node:fs? Idk, will it segfault?

Anyway the risks are probably mostly hypothetical in nature. I'll try and open a PR soon to continue the discussion.

@laverdet
Copy link
Contributor

I'm abandoning the PR at #50618. After more reflection I think this is going to harm the ecosystem more than it will do any good. From what I can tell there are 3-4 different use-cases mentioned in this issue which can be solved in other ways:

Q: "How do I retry a module whose file content was created after a failed import"
A: Just reimport it with a cache bust: await import("failed-module?retry=1")

Q: "How do I reload a module and all its dependencies"
A: Use a loader which carries forward the cache bust: #50618 (comment)

Q: "How do I add module support to unit test frameworks"
A: Use vm.SourceTextModule

Q: "How do I hot reload modules during development"
A: Use dynohot

@pygy
Copy link

pygy commented Jan 18, 2024

@laverdet Does this look fine to you (this is a reworked version of #50618 (comment))?

https:/pygy/esm-reload

@laverdet
Copy link
Contributor

@pygy The loader code looks correct and does what it says it does.

import * as mDev from './my-module.js?dev'
process.env.NODE_ENV='production'
import * as mProd from './my-module.js?prod'

This counter-example from the documentation doesn't make sense though. Imports are not executed in an imperative manner, so the body of ./my-module.js?prod will actually run before process.env.NODE_ENV='production' is evaluated. In fact you can't even guarantee that ./my-module.js?dev will run before ./my-module.js?prod without also looking at the rest of the module graph.

@pygy
Copy link

pygy commented Jan 18, 2024

Good catch, thanks this should have been dynamic imports.

Fixed, and I added an example with dependencies for extra clarity:


With dependencies

Suppose these files:

// foo.js
export {x} from "./bar.js"

// bar.js
export const x = {}

We can then do

import "esm-reload"

const foo1 = await import("./foo.js?instance=1")
const bar1 = await import("./bar.js?instance=1")

const foo2 = await import("./foo.js?instance=2")
const bar2 = await import("./bar.js?instance=2")

assert.equal(foo1.x, bar1.x)
assert.equal(foo2.x, bar1.x)

assert.notEqual(bar1.x, bar2.x)

Edit again: https://www.npmjs.com/package/esm-reload

Copy link
Contributor

There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the never-stale Mark issue so that it is never considered stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment.
For more information on how the project manages feature requests, please consult the feature request management document.

@github-actions github-actions bot added the stale label Jul 17, 2024
@simlu
Copy link

simlu commented Jul 17, 2024

Should probably close this as "wont do"?

@github-actions github-actions bot removed the stale label Jul 18, 2024
@lzxb
Copy link

lzxb commented Aug 14, 2024

My example

import fs from 'node:fs';
import { isBuiltin } from 'node:module';
import path from 'node:path';
import { fileURLToPath } from 'node:url';
import {
    createContext,
    type Module,
    type ModuleLinker,
    SourceTextModule,
    SyntheticModule
} from 'node:vm';

const ROOT_MODULE = '__root_module__';

const link: ModuleLinker = async (specifier: string, referrer: Module) => {
    // Node.js native module
    const isNative = isBuiltin(specifier);
    // node_modules
    const isNodeModules =
        !isNative && !specifier.startsWith('./') && !specifier.startsWith('/');
    if (isNative || isNodeModules) {
        const nodeModule = await import(specifier);
        const keys = Object.keys(nodeModule);
        const module = new SyntheticModule(
            keys,
            function () {
                keys.forEach((key) => {
                    this.setExport(key, nodeModule[key]);
                });
            },
            {
                identifier: specifier,
                context: referrer.context
            }
        );
        await module.link(link);
        await module.evaluate();
        return module;
    } else {
        const dir =
            referrer.identifier === ROOT_MODULE
                ? import.meta.dirname
                : path.dirname(referrer.identifier);
        const filename = path.resolve(dir, specifier);
        const text = fs.readFileSync(filename, 'utf-8');
        const module = new SourceTextModule(text, {
            initializeImportMeta,
            identifier: specifier,
            context: referrer.context,
            // @ts-expect-error
            importModuleDynamically: link
        });
        await module.link(link);
        await module.evaluate();

        return module;
    }
};

export async function importEsm(identifier: string): Promise<any> {
    const context = createContext({
        console,
        process,
        [ROOT_MODULE]: {}
    });
    const module = new SourceTextModule(
        `import * as root from '${identifier}';
        ${ROOT_MODULE} = root;`,
        {
            identifier: ROOT_MODULE,
            context
        }
    );
    await module.link(link);
    await module.evaluate();
    return context[ROOT_MODULE];
}

function initializeImportMeta(meta: ImportMeta, module: SourceTextModule) {
    meta.filename = import.meta.resolve(module.identifier, import.meta.url);
    meta.dirname = path.dirname(meta.filename);
    meta.resolve = import.meta.resolve;
    meta.url = fileURLToPath(meta.filename);
}

Use it

const module = await importEsm('filename');

@RomainLanz
Copy link
Contributor

If you are interested, we created Hot Hook to hot reload node imports during development.

https://adonisjs.com/blog/hmr-in-adonisjs
https://docs.adonisjs.com/guides/concepts/hot-module-replacement
https:/julien-R44/hot-hook

@Filipoliko
Copy link

Filipoliko commented Oct 4, 2024

Hi, the ability to invalidate cache when using import (or in my case even require) seems to be very important to be able to use the new experimental mock.module functionality from the native Node test runner.

https://nodejs.org/api/test.html#mockmodulespecifier-options

Here is a use-case, where you can run into a trouble.

fs-extended.mjs

import fs from 'node:fs';

function getFs() {
    return fs;
}

export { getFs };

my.test.mjs

import assert from 'node:assert';
import { mock, test } from 'node:test';

// This works well
test('Mock shallow module', async () => {
    for (let i = 0; i < 2; i++) {
        const fsMock = mock.module('node:fs', { namedExports: { writeFileSync: mock.fn(() => i) } });

        const fs = await import('node:fs');

        assert.strictEqual(fs.writeFileSync(), i);

        fsMock.restore();
    }
});

// This fails
test('Mock deep module', async () => {
    for (let i = 0; i < 2; i++) {
        const fsMock = mock.module('node:fs', { namedExports: { writeFileSync: mock.fn(() => i) } });

        const { getFs } = await import('./fs-extended.mjs');
        const fs = getFs();

        assert.strictEqual(fs.writeFileSync(), i);

        fsMock.restore();
    }
});

Result:

test at my.test.mjs:16:1
✖ Mock deep module (4.481709ms)
  AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
  
  0 !== 1
  
      at TestContext.<anonymous> (file:///Users/filip.satek/git/cns-cli/my.test.mjs:23:16)
      at async Test.run (node:internal/test_runner/test:931:9)
      at async Test.processPendingSubtests (node:internal/test_runner/test:629:7) {
    generatedMessage: true,
    code: 'ERR_ASSERTION',
    actual: 0,
    expected: 1,
    operator: 'strictEqual'
  }

Even though I called fsMock.restore(), in the second iteration, the getFs call still gave me the node:fs module from the first iteration.

I hope this is related to this issue, otherwise please let me know and I will file a new issue for my use-case.

Edit: Tested on Node v22.9.0 using node --test --experimental-test-module-mocks my.test.js command.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. feature request Issues that request new features to be added to Node.js.
Projects
Status: Awaiting Triage
Development

No branches or pull requests