Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

emit error events consistently in next tick #5251

Closed

Conversation

thefourtheye
Copy link
Contributor

Similar to callbacks, error events also have to be emitted in the
nextTick, so that they will be guaranteed to work asynchronously. This
patch makes sure all the error events

  1. use arrow functions consistently (this is chosen because they
    maintain the context and that would be useful more often)
  2. are emitted in the process.nextTick

I added major label as well, as this changes the behaviour of the existing system.

Also, this removes few TODOs left by @isaacs and @bnoordhuis

@thefourtheye thefourtheye added semver-major PRs that contain breaking changes and should be released in the next major version. lib / src Issues and PRs related to general changes in the lib or src directory. labels Feb 15, 2016
@@ -363,6 +363,9 @@ function Zlib(opts, mode) {
var error = new Error(message);
error.errno = errno;
error.code = exports.codes[errno];
// TODO: Can't defer the error emission to nextTick because both the sync
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisdickinson any idea how this can be solved? There is a chance that I might have totally misunderstood the problem here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can create two onerror handlers, one for the sync code and one for the async code:

The code in

node/src/node_zlib.cc

Lines 344 to 367 in 0b3936b

static void Error(ZCtx* ctx, const char* message) {
Environment* env = ctx->env();
// If you hit this assertion, you forgot to enter the v8::Context first.
CHECK_EQ(env->context(), env->isolate()->GetCurrentContext());
if (ctx->strm_.msg != nullptr) {
message = ctx->strm_.msg;
}
HandleScope scope(env->isolate());
Local<Value> args[2] = {
OneByteString(env->isolate(), message),
Number::New(env->isolate(), ctx->err_)
};
ctx->MakeCallback(env->onerror_string(), ARRAY_SIZE(args), args);
// no hope of rescue.
if (ctx->write_in_progress_)
ctx->Unref();
ctx->write_in_progress_ = false;
if (ctx->pending_close_)
ctx->Close();
}
would need to be duplicated (and shared through utility functions) for the sync and async versions - currently they both use Zlib::Error - they'd have to use ::Error and ::ErrorAsync and places that call ::Error would have to be modified.

I think it would be a shame to block this PR on that though and it would require a lot more extensive review.

@thefourtheye
Copy link
Contributor Author

@@ -328,7 +325,7 @@ function socketOnEnd() {
if (!req.res && !req.socket._hadError) {
// If we don't have a response then we know that the socket
// ended prematurely and we need to emit an error on the request.
req.emit('error', createHangUpError());
process.nextTick(() => req.emit('error', createHangUpError()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the side effect of changing the stack where the hangup error is created (the new Error is called inside the callback on next tick). This will disassociate the stack trace from it's context. If we're going to do this, I recommend first creating the Error object, then passing it off to the next tick for dispatch and handling...

const err = createHangUpError();
process.nextTick(() => req.emit('error', err));

@thefourtheye
Copy link
Contributor Author

@jasnell You are correct. Fixed them now. PTAL.

@thefourtheye
Copy link
Contributor Author

cc @nodejs/ctc, as it is a major change.

@@ -585,14 +585,12 @@ function onhandshakestart() {
ssl.lastHandshakeTime = now;
if (first) return;

if (++ssl.handshakes > tls.CLIENT_RENEG_LIMIT) {
if (++ssl.handshakes > tls.CLIENT_RENEG_LIMIT && self.cleartext) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this can be safely changed (e.g. self.cleartext's value won't change between here and the next tick)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that it's likely inside the setImmediate() because it could potentially change by that point. I'd also be very wary changing setImmediate() to process.nextTick() for this one. Needs input from @nodejs/crypto I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also -1 on this change unless we can verify why it was setImmediate and not nextTick before. @indutny any reason you chose to setImmediate here?

Also note that this method fired the event in the next tick anyway so in terms of consistency I think this is fine the way it is.

@rvagg
Copy link
Member

rvagg commented Feb 17, 2016

/cc @nodejs/streams since there's some adjustments to streams error timing

I like the idea here but it's pretty high risk.

@benjamingr
Copy link
Member

Looks good. It's definitely a breaking change though. Namely - things that would crash the process by default now do not crash the process by default.

@@ -447,7 +447,8 @@ function maybeReadMore_(stream, state) {
// for virtual (non-string, non-buffer) streams, "length" is somewhat
// arbitrary, and perhaps not very meaningful.
Readable.prototype._read = function(n) {
this.emit('error', new Error('not implemented'));
const err = new Error('not implemented');
process.nextTick(() => this.emit('error', err));
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of the above stream methods are very hot performance wise. what's the perf implications of introducing the closures? /cc @mcollina

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(except the default _read function of course)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mafintosh sorry for the ignorance, but why would error paths ever be hot?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in v8 all functions are "hoisted" no matter where in a function they were declared. or at least that used to be the case in v8 unless they changed how that works

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm unsure about the streams error handling implications of this though as well. this will make the state machine more complicated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed the fact your comment addressed all the function ode above and not just _read.

So - move it to a named function that takes the error as a parameter to avoid the closure?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yea sorry about that. that probably works. also running the benchmarks will probably show if this is still a problem in general. @mcollina is more of an expert on this than I am

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the suggestion is to create a utility function which will emit error events?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, I would rewrite all of those as

function myFunc () {
  let err = new Error('myerror')
  nextTick(emitError, this, err) 
  return
}

function emitError (stream, err) {
  stream.emit('error', err)
}

This should be the best way to handle this. But run the benchmarks before and after.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking the same thing as I was scanning the changes. I guess the function could be stored in internal/util.js perhaps?

@mcollina
Copy link
Member

This is a massive change. I really like the idea, because it will clearly simplify how things are done. We should probably run this change through some applications/real-world use case to spot regressions early.

Also, see my comment about having a separate function for emitting the error.

@Fishrock123 Fishrock123 added the discuss Issues opened for discussions and feedbacks. label Feb 17, 2016
@jasnell
Copy link
Member

jasnell commented Feb 19, 2016

CITGM: https://ci.nodejs.org/job/thealphanerd-smoker/87/ /cc @thealphanerd

Nevermind... lol that didn't run like I expected it to. @thealphanerd can you queue up a citgm run on this?

@jasnell
Copy link
Member

jasnell commented Feb 21, 2016

In general this LGTM if CI is green. I'd like to see the CITGM impact tho before it lands.

@mcollina
Copy link
Member

I'm -1 on its current form, as closure creation should be avoided in hot paths. See #5251 (comment) on how to addess this. I can send a PR to this branch if needed.

@jasnell
Copy link
Member

jasnell commented Feb 21, 2016

Looking at it again, I agree with @mcollina ... so LGTM with that change made

@mafintosh
Copy link
Member

Is CI running with the stream modules that @thealphanerd added? This might have implications for streams error handling

@MylesBorins
Copy link
Contributor

here's citgm... with stream modules included
https://ci.nodejs.org/job/thealphanerd-smoker/89/

@mcollina
Copy link
Member

It does not seem the tests for dependent modules are passing: https://ci.nodejs.org/job/thealphanerd-smoker/89/nodes=ubuntu1404-64/console. Is there a more human readable thing to understand who is causing them to fail?

@isaacs
Copy link
Contributor

isaacs commented May 20, 2016

If we decide to land this, a lot of module authors would not be happy. However, I think we might want to do it anyway.

Making module authors unhappy is a bad idea. Not just because module authors are a vocal and important part of the Node community, but because they're a good proxy for production Node.js users.

Think about why it is that Streams had to be excluded from this PR.

Errors on an object must be able to prevent further interaction with that object, unless the error is handled. That's the point of having unhandled error events throw.

With streams, that is an obvious problem, because streams make a lot of assumptions about the order in which events and actions happen, so that data can be serialized. But Sockets and HTTP agents also make assumptions about event ordering, they just do so in ways that tend to be less precise than streams, and less easy to test. Especially in high-load situations, I've seen the ordering of events in an HTTP conversation to sometimes be very relevant. So it's a bigger problem, but a less obvious one, and only in cases that are challenging to debug.

I fear that this change will cause some very subtle and significant problems for production applications.

@chrisdickinson
Copy link
Contributor

There's a lot of stackcallsite-dropping happening as a result of this patch.

/cc'ing @nodejs/post-mortem based on this.

@mcollina
Copy link
Member

@isaacs the other option is making all of them emit synchronously. The current situation where some of them are wrapped in nextTick and some aren't is confusing. Either options are fine, as long as it's consistent and predictable.

wrapping Errors in nextTick is needed because some time the user had not added an error handler just yet, because:

  1. the user request some resource to be allocated
  2. resource is allocated
  3. resource needs throwing
  4. the user did not have a chance add an on('error')  handler, because all of that happened synchronously

@isaacs how would you handle the above case?

Think about why it is that Streams had to be excluded from this PR.

That might be the main reason why the general rule "errors should be emitted in the next tick" should not be enforced, but rather discouraged.

IMHO a lot of code out there relies on "zalgo" behavior and assumption about event-ordering that are not necessarily true and part of the API.

@sonewman
Copy link
Contributor

-1 this seems way too high risk. Also it could lead to very hard to debug issues, in the case of generic errors there would be no stack trace back to the original stream.

@mcollina
Copy link
Member

Also it could lead to very hard to debug issues, in the case of generic errors there would be no stack trace back to the original stream.

That is not correct, the Errors are allocated with the right stacktrace attached, only the emit('error', err) is deferred.

@sonewman
Copy link
Contributor

That is garenteed in all cases?

@sonewman
Copy link
Contributor

Ok I checked the code more thoroughly. I guess the major thing then is that all stream users and implemented must handle error events (obviously that should be done anyway) but it's a big responsibility to be remembered in all cases

@mcollina
Copy link
Member

@sonewman streams are not touched here. Everything else is. And users have already the responsibility of adding on('error') handlers everywhere, so this is not what we are discussing.

@misterdjules
Copy link

@chrisdickinson Thank you for the heads up, it is very much appreciated!

On:

There's a lot of stackcallsite-dropping happening as a result of this patch.

/cc'ing @nodejs/post-mortem based on this.

My perspective is that emitting errors synchronously or asynchronously doesn't matter too much for users of post-mortem debuggers. Errors that are emitted (as opposed to thrown) should represent operational errors and currently, when a process aborts due to an emitted error, the reason is that the programmer did not handle an operational error, which is the actual programmer error.

When inspecting a core file generated from a process that aborted due to an emitted error without an error listener, a post-mortem debugger user can take a look at the call stack and see that they forgot to handle an operational error and fix the problem.

This represents my opinion and not the opinion of the broader @nodejs/post-mortem working group. I encourage them to voice their opinion if it is different than the one I just described.

Now I'm not sure what this PR fixes. The original comment says:

Similar to callbacks, error events also have to be emitted in the nextTick, so that they will be guaranteed to work asynchronously.

Why do they have to be emitted from nextTick's callback and why do they have to guarantee that they "work asynchronously"?

Is it useful to defer the emission of error events, even when they are emitted from an asynchronous operation's callback? It seems that deferring the emission of error events is useful when it's the only way to allow users to attach a listener for such events, which AFAIK is already done without these changes, but I'm not familiar with other use cases.

@isaacs had provided very useful details of the motivation for an older issue with similar goals. I would think that we'd want to see examples of such use cases that are solved by this change in the description of this PR and in new test files. Otherwise I'm not sure we collectively share an understanding of what this PR fixes and its impact on the ecosystem.

@sonewman
Copy link
Contributor

@mcollina I seem to have got the wrong end of the stick completely. My apologies. I think I misinterpreted the discussion of consistency and the stream examples above.

The one place I think this would trip me up is in the case of client request. If I was piping to it and for some reason an abort was needed, from what I understand this would mean abort would no longer throw (which is good - all errors handled in one place) is there some way to detect that behaviour change between node versions?

@sonewman
Copy link
Contributor

  • obviously the version number of node could be checked so maybe yet another dumbass question from me.

@jasnell
Copy link
Member

jasnell commented May 20, 2016

Given the conversation and the concerns, it would likely be best to actually break this up and look at each of the proposed changes individually as separate semver-majors. It would be significantly easier to reason about possible regressions.

@isaacs
Copy link
Contributor

isaacs commented May 20, 2016

Yes, I opened that issue three years ago. If I remember correctly, discussion led to the conclusion that it was probably a massive undertaking likely to cause some subtle and strange errors in userland code, so we didn't do it.

It's a shame that discussion didn't make it to the issue.

@mcollina
Copy link
Member

Taking a step back... what benefit this change will have to the overall node community? I see plenty of technical reasons (on which I agree), but very few example of how this change will impact on a day-to-day experience.
The current situation works.

@benjamingr
Copy link
Member

Taking a step back... what benefit this change will have to the overall node community? I see plenty of technical reasons (on which I agree), but very few example of how this change will impact on a day-to-day experience.

Well, it would prevent really strange timing errors where you attach the error handler synchronously, it would make learning how to work with the APIs arguably easier and it would mean you don't have to distinguish sync and async errors (effectively helping with "zalgo").

@trevnorris
Copy link
Contributor

We should always be emitting errors in a nextTick(). It needs to be done some places for sanity. e.g.:

const server = net.createServer(c => {}).listen(8080);
server.on('error', (e) => {
  // Passing only port, bind() happens immediately which would cause an
  // error to be thrown since we didn't have time to attach the error handler.
});

And so for consistency it needs to be done everywhere. As someone teaching a friend to use node, these discrepancies in timing make teaching significantly more complicated.

@rvagg rvagg force-pushed the master branch 2 times, most recently from c133999 to 83c7a88 Compare October 18, 2016 17:01
@mscdex mscdex added errors Issues and PRs related to JavaScript errors originated in Node.js core. and removed error-handling labels Oct 28, 2016
@jasnell
Copy link
Member

jasnell commented Feb 28, 2017

ping @thefourtheye ... still want to pursue this?

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Mar 1, 2017
@fhinkel
Copy link
Member

fhinkel commented May 23, 2017

This hasn't been touched for quite a while. I'm going to close this. Feel free to re-open if it shouldn't be closed!

@fhinkel fhinkel closed this May 23, 2017
@mcollina
Copy link
Member

@thefourtheye @fhinkel if we still want this, I can have a look. I really think we should get this through, and some of is improved in #12925.

@benjamingr
Copy link
Member

@mcollina
While personally I think this is the way to go - I think that this passing would raise nodejs/post-mortem concerns, and I don't feel like we should "sneak" this in.

Emitting error events later would mean core dumps would not represent the call stack. While this is already the case with a good chunk of how we emit error events - given the amount of push-back against promises emitting errors in next tick - landing this under true consensus would be a very challenging task.

@gibfahn
Copy link
Member

gibfahn commented May 24, 2017

While personally I think this is the way to go - I think that this passing would raise nodejs/post-mortem concerns,

cc/ @nodejs/post-mortem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issues opened for discussions and feedbacks. errors Issues and PRs related to JavaScript errors originated in Node.js core. lib / src Issues and PRs related to general changes in the lib or src directory. semver-major PRs that contain breaking changes and should be released in the next major version. stalled Issues and PRs that are stalled.
Projects
None yet
Development

Successfully merging this pull request may close these issues.