Skip to content
This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

Locking down the "process" and "Buffer" globals #235

Closed
guybedford opened this issue Nov 27, 2018 · 30 comments
Closed

Locking down the "process" and "Buffer" globals #235

guybedford opened this issue Nov 27, 2018 · 30 comments

Comments

@guybedford
Copy link
Contributor

ES modules provide a strong encapsulation. It can be guaranteed that a module only has access to the global and imported bindings. This means module-level security becomes a very real possibility by providing import permissions per-module, which is quite exciting (and possible future directions for this group and related modules work in Node).

One thing that puts a spanner in all this is the process and Buffer globals in Node.js. They are always available in ES modules and if we ship modules with these then they could turn out difficult to deprecate. For example, process gives access to high-resolution timers, OS details, base-level hooks and all native bindings. All of these are huge access vectors inhibiting security of modules.

I previously attempted to lock down these globals in nodejs/ecmascript-modules#5 but this was shot down pretty quickly for being a bad approach in terms of performance.

I'd like to reopen this discussion though, because if we can stop the ecosystem from using the process and Buffer globals, this will put us on a strong path for enabling modular security in Node.js going forward, and there will be no easier time to make this change than in the switch to modules. Once code in the wild relies on this stuff, deprecation gets much harder.

@ljharb
Copy link
Member

ljharb commented Nov 27, 2018

It can be guaranteed that a module only has access to the global and imported bindings.

A module also has access to every global variable, which in both node and browsers is a large number. I believe you can also do Function('foo = 3')(); console.log(foo) in a Module and 3 will be logged.

I don't think that it's possible to lock down any globals via modules, nor do I think it would be in any way a good idea to make different non-module-related globals available in CJS vs ESM.

The proper way to make this sort of change is in CJS itself, at which point, ESM would be able to leverage the change for free.

@devsnek
Copy link
Member

devsnek commented Nov 27, 2018

this isn't actually a change to be made to modules, its a change to be made to regular node core. you would have to remove process and Buffer from the global object, possibly re-injecting them in cjs.

@jkrems
Copy link
Contributor

jkrems commented Nov 27, 2018

Wouldn't it be better to run un- or semi-trusted code in separate contexts or isolates?

But I'm generally +1 on deprecating the globals in favor of explicit imports/requires everywhere. We could even have a flag that opts into the removal.

@weswigham
Copy link
Contributor

this isn't actually a change to be made to modules, its a change to be made to regular node core. you would have to remove process and Buffer from the global object, possibly re-injecting them in cjs.

I disagree - it's already bad enough that for some reason require and __filename being context-sensitive psuedo-globals is apparently reason for them to be excluded from the esm global scope (unless you read cjs wrapper spec or assume implementation details, you access them like they're globals in your script and can usually assume they mostly behave like fancy globals up to some edge cases) - making more things into psudo-globals isn't the solution - proper node-wide deprecation and removal is. (I really dislike anything that leads to extra divergence between cjs and esm scripts) The global scope is the global scope - what appears to be global today should continue to appear to be global in a module.

And I think @jkrems is right - if you really want module-level security, you should really be trying to run untrusted code in a new context/isolate. Proxy or wrapper based sandboxing is a kind of half-effort, and doesn't prevent a large class of issues (ie, denial of service through infinite work loops).

It can be guaranteed that a module only has access to the global and imported bindings.

That's doesn't seem to be quite true, if I'm understanding you right. Just like how a require function can be passed around in cjs, so, too, can a dynamic import - I can create a module such as export default id => import(id) and then

import fromOther from "./other";

export let x;

fromOther("some-other-package").then(mod => x = mod);

@devsnek
Copy link
Member

devsnek commented Nov 27, 2018

@weswigham the key of my statement was "its a change to be made to regular node core. you would have to remove process and Buffer from the global object". i agree that we shouldn't make more pseudo-globals. my overarching point is that this wouldn't be a change the modules group makes, it would be a change node collaborators in general make because it would be a change to all of node.

@robpalme
Copy link
Contributor

Forcing ESM users to explicitly import "@node/Buffer" or "@node/process" seems like a small impact given users are writing for a whole new module system.

The benefit is significant: we help encourage the creation of Web-compatible modules, increase the visibility of dependencies, and set a clear precedent for a more secure/explicit way of accessing functionality.

I struggle to believe we'll ever get much traction or benefit from eliminating them from CJS modules, so wouldn't want to tie this improvement to changing the CJS world.

@devsnek
Copy link
Member

devsnek commented Nov 27, 2018

@robpalme its not possible to have them on the global object and to not appear in modules. the change, on a technical level, must happen in core/cjs land.

@weswigham
Copy link
Contributor

The benefit is significant: we help encourage the creation of Web-compatible modules, increase the visibility of dependencies, and set a clear precedent for a more secure/explicit way of accessing functionality.

On a non-technical note, I disagree with this point - there's little practical difference between depending on an @node/Buffer import specifier and a potentially polyfilled global Buffer (in both cases, in the browser, you'll need to have made additional code to provide Buffer available - either via the import or the global). Except you can't feature detect the import for a conditional fallback (this is a known downside to static imports) without relying on dynamic import, which taints your program with asynchronicity. So in some ways, an import is worse than a global. The usual reason to avoid globals is to avoid conflicting globals, but node has implicitly solidly reserved process and Buffer in the global scope with it's popularity already - it's not much of a concern here.

@robpalme
Copy link
Contributor

@devsnek We could solve this by using a separate global object for ESM. Is this what you had in mind @guybedford?

@weswigham I think there is a practical difference in the way the browser error is surfaced to users: load time vs potentially late runtime failure. And there's a difference in static analysis to detect usage: it cannot be safely done on globals, whereas you can for import assuming you avoid dynamic import of non-string literals.
Also polyfilling is likely to be solved at the loader level last time I checked import-maps/layered-apis.

@devsnek
Copy link
Member

devsnek commented Nov 28, 2018

@robpalme

We could solve this by using a separate global object for ESM

that's not... a thing. the global is the global. you would need to propose a spec change which adds... "non-global globals"? i'm not really sure what the semantics of this would be but it sounds like a rabbit hole of evil.

@ljharb
Copy link
Member

ljharb commented Nov 28, 2018

I don’t think ESM should be taken as an opportunity to clean up unrelated things you don’t like in node. If something is subpar, it should be fixed for all node users.

@guybedford
Copy link
Contributor Author

guybedford commented Nov 28, 2018

Here's a new proposal for Node in general that could work. Please can everyone let me know feedback and if you would support this and I can begin the process of bringing this into Node.js core. I'm glad we're all coming around on security awareness :)

@ljharb @devsnek you're exactly right - so, yes the tricky part here is that ES modules don't have a "separate" global. So that's why the previous approach needed proxies and magic to get it to work.

  1. Add "process" and "Buffer" into the CJS wrapper itself, so that they are locals. This means code assuming process or Buffer in CJS can continue to work always. Breaking change risk: Assignments to process and Buffer will no longer override the global but only the module-scope values.

  2. Then start a deprecation path for process and Buffer on global by adding a warning to the global.process and global.Buffer getters.

  3. Land modules alongside the above deprecation warnings so that we can start to build the modules ecosystem without these globals, allowing a better security footing moving forward.

  4. Eventually remove global.Buffer and global.process entirely.

@devsnek
Copy link
Member

devsnek commented Nov 28, 2018

@guybedford seeing as we're on the same page, i think you should bring this issue to node core, since there are no actual changes to modules happening here.

@guybedford
Copy link
Contributor Author

Thanks @devsnek I wanted to at least have some support from the modules group in this direction as it will be somewhat of an undertaking, but certainly I will do that!

@bmeck
Copy link
Member

bmeck commented Nov 28, 2018

Per the separated globals, non-context based isolation is being looked at by Realms and things like the WindowProxy effectively are being used in other scenarios where the global is shared but given different views.

Realms is looking at this to help with what they call "identity discontinuity" which is the problem where Array is not shared between isolated pieces of code so arr instanceof Array is unreliable. On last weeks call the Realms group were thinking of making a JS spec PR to formally allow hosts to setup per-source text GlobalThisValue instead of a single shared one.

WindowProxy is... complicated, but not going to change anytime. It is unknown if the change from the Realms group has any problems with it, but it seems unlikely.

If someone wanted to change the behavior of globals I'd suggest they get together with the Realms group (feel free to email myself or @erights). I'd agree with @devsnek that this might better be done outside of the Modules WG.

@ljharb
Copy link
Member

ljharb commented Nov 28, 2018

(Note that i don’t think process should ever be deprecated or unavailable in ESM, as having an environment-specific global like window or process is absolutely essential for feature detection use cases - but we can discuss that in core)

@guybedford
Copy link
Contributor Author

@ljharb agree the feature detection use cases need to be handled for ESM, but feature detection use cases certainly don't require access to base-level application hooks, deep process information, all native bindings and high resolution timers.

@SMotaal
Copy link

SMotaal commented Nov 29, 2018

I recall having to work with import process from 'process' at some point in exm but cannot remember the context for that except that I after tirelessly exploring the story of "platform" globals and how module contexts created the first real (and non-breaking) opportunity to safeguard against inadvertent contamination. Not to digress, though, my hope was that once ESM stabilized enough to address secondary details, that new solutions for "platform" bindings can be offered for modules (ie also non-breaking).

So my take was a special specifier-less syntax which comes with the added benefits of returning undefined (and potentially even safe destructuring if appropriate). I pick this example to quickly outline the depth of considerations involved:

import window, process, console as globalConsole; // safely used for cross-platform 

const locals = {console: typeof console === 'object' && globalConsole !== console && console || undefined};
const globals = {window, process, console: globalConsole};

if (!globals.console && !locals.console) 
  throw Error('Unsupported platform: access to console is required');

if (globals.console) globals.console.log(['[global console]', {globals, locals});
if (locals.console) locals.console.log('[local console]', {globals, locals}); // when local !== global

So to keep it short, and I am sure others have thought of much more elegant ideas for globals inside and outside the context of modules. Given the successful outcomes of module context and how well off ESM is now, is it maybe time to consider similar proposals?

@SMotaal
Copy link

SMotaal commented Nov 29, 2018

@bmeck on "identity discontinuity" I think I was also hitting the flip-side of that when shimming a realm to simulate module bindings.

So while obviously instanceof is the primary issue for literals crossing realms, a secondary concern is that it seemed more intuitive (at least for me) to think that literals like […] and {…} would somehow construct from the Array and Object in their root (for a lack of a better word) scope - ie realm, module context, global, or even an object context with a theoretical explicitly non-piercing designation, which is predetermined and intentionally excludes dealing with the immediate scope for obvious performance pitfalls.

While this is not a well thought out idea at the moment, but it draws upon the evolving (and much more endorsable) practice of certain libraries wanting to stop overloading those globals and moving instead to:

import overloads, {Array, Object} from 'library/overloads'; // module scope
import globals from 'library/globals';

{
  const literalObject1 = {};
  // expect: literalObject1.constructor !== globals.Object

  const overloadedObject = new overloads.Object();
  // expect: literalObject1.constructor === overloadedObject.constructor

  {
    const {Object} = globals;

    const literalObject2 = {};
    // expect: literalObject2.constructor === literalObject1.constructor
    
    const globalObject = new globals.Object();   
    // expect: literalObject2.constructor !== globalObject.constructor

    // I don't feel comfortable with the next one yet but maybe others do

    theoretically: with (globals) {  const literalObject3 = {}; }
    // expect: literalObject3.constructor === globalObject.constructor

  }
}

// destructuring `const {Object} = …` here does not affect literals only 
// local reference to Object so `new Object()` maybe different from `{}`

@bmeck
Copy link
Member

bmeck commented Nov 29, 2018

@SMotaal

Identity discontinuity is the main perpetrator for a variety of things since it is often found when people use instanceof if you look into the problems GraphQL had that made it roll back away from dual mode modules, they were due to instanceof and identity discontinuity.

Given the successful outcomes of module context and how well off ESM is now, is it maybe time to consider similar proposals?

This idea of new syntax would be fine to bring to TC39, though a lot of the builtin Module friction is about ensuring that hosts and polyfills can both have fallbacks for any sort of builtins to the language that are loaded via the Module system. You would need to find a champion.

So while obviously instanceof is the primary issue for literals crossing realms, a secondary concern is that it seemed more intuitive (at least for me) to think that literals like […] and {…} would somehow construct from the Array and Object in their root (for a lack of a better word) scope - ie realm, module context, global, or even an object context with a theoretical explicitly non-piercing designation, which is predetermined and intentionally excludes dealing with the immediate scope for obvious performance pitfalls.

I don't really follow, lots of things don't defer to the global constructor functions when being initialized. https://tc39.github.io/ecma262/#sec-array-initializer-runtime-semantics-evaluation for example uses something that cannot be "virtualized" (~ replaced).

The ideas you are posing sound similar to the Realms API and Compartments but mixing with lexical scoping syntax. I'd be wary of that since you can get into some non-trivial to read discontinuity.

let a;
with(globals) { // assuming it overwrote the primordials and not just constructors
  a = [];
  console.log(a instanceof Array); // true
}
console.log(a instanceof Array); // false

Easy mistakes that fall back to what Realms are trying to avoid have led the Realms APIs to move increasingly to a message passing interface set of design patterns. Direct mixing of first class values or syntax has proven hard to narrow down to a point of reasonable usability even when people are attempting to be careful.


I'd suggest looking into the Realms API efforts before proposing anything new to the language on this front since they have encountered many problems in this area and can give feedback more than I can as a single person.

@SMotaal
Copy link

SMotaal commented Dec 1, 2018

@bmeck great point... I point out examples in case someone else has been working towards a proposal because it is helps make my point less vague.

I followed realms sparsely in the past and was eager to catch up again given their lastest and hopefully final spur. Yes, the last thing we want is another proposal getting in the way of finally getting it through.

Are you aware of people who are looking into importing of built-ins (static or dynamic)?

@bmeck
Copy link
Member

bmeck commented Dec 3, 2018

@SMotaal no dynamically generated import values are being looked at to my knowledge. A new importable standard library might be what you are looking for? Though to make it safe against a variety of historical issues the actual library itself will likely not be usable in the same sense that most existing built-ins are due to things like freezing and shared identity across Realms.

@SMotaal
Copy link

SMotaal commented Dec 4, 2018

@bmeck this is perfect — not sure how I missed it till now 😄

@Fishrock123
Copy link
Contributor

my 2c:

  • Buffer should never have been global
  • deprecating 'global-like access' for either in CJS contexts is not feasible
    • I recall us trying this at some point for Buffer in early Node / io.js but I can't find any refs...

@guybedford
Copy link
Contributor Author

@Fishrock123 thanks for the feedback - as mentioned my approach was to seek to add deprecation warnings to global.Buffer and global.process access for the next Node, while adding them into the CJS wrapper. The wrapper handling should cover the major case. Would be interesting to hear your concerns further here...

@devsnek
Copy link
Member

devsnek commented Dec 5, 2018

it might be useful to move this conversation to core, for more feedback.

@guybedford guybedford removed the modules-agenda To be discussed in a meeting label Dec 7, 2018
@bmeck
Copy link
Member

bmeck commented Dec 30, 2018

@guybedford do we want to move this to Core or can we close this as a non-Modules WG issue?

@guybedford
Copy link
Contributor Author

I've finally created a Node.js core PR for this in nodejs/node#26334.

@guybedford
Copy link
Contributor Author

I'm adding the module agenda label as we have some concerns on this feature from group members that requires some discussion further.

@bmeck
Copy link
Member

bmeck commented Jan 21, 2021

I'm going to close this due to staleness and hopes that Realms/Compartments solve this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants