Trigger a panic on allocation error on nightly Rust #66347

Aaron1011 · 2019-11-12T22:24:34Z

In issue #66342, we're seeing extremely large allocations by rustc
(4GB for the hex crate, which is only a few hundred lines). This is
exhausing the memory on CI, causing jobs to fail intermittently.

This PR installs a custom allocation error hook for nightly compilers,
which attempts to trigger a panic after printing the initial error
message. Hopefully, this will allow us to retrieve a backtrace when one
of these large spurious allocations occurs.

The hook is installed in librustc_driver, so that other compiler
frontends (e.g. clippy) will get this logic as well.

I'm unsure if this needs to be behind any kind of additional feature
gate, beyond being ngithly only. This only affects compiler frontends,
not generic users of libstd. While this will nake OOM errors on
nightly much more verbose, I don't think this is necessarily a bad
thing. I would expect that out of memory errors when running the
compiler are usually infrequent, so most users will probably never
notice this change. If any users are experiencing #66342 (or something
like it) on their own crates, the extra output might even be useful to
them.

I don't know of any reasonable way of writing a test for this. I
manually verified the implementation by inserting:
let _a: Vec<usize> = Vec::with_capacity(9999999999) into
librustc_driver after the hook installation, and verified that
a backtrace was printed.

If we're very unlucky, it may turn out the large allocation on CI
happens to occur after several large successful allocations, leaving
extremely little memory left when we try to panic. If this is the case,
then we may fail to panic or print the backtrace, since panicking
currently allocates memory. However, we will still print an error
message, so the output will be no less useful (though a little more
spammy) then before.

rust-highfive · 2019-11-12T22:24:38Z

r? @cramertj

(rust_highfive has picked a reviewer for you, use r? to override)

Mark-Simulacrum · 2019-11-12T23:17:53Z

Instead of panicking, could we just capture and print a backtrace? I believe that given std::Backtrace that shouldn't be too hard today, right?

In issue rust-lang#66342, we're seeing extremely large allocations by rustc (4GB for the `hex` crate, which is only a few hundred lines). This is exhausing the memory on CI, causing jobs to fail intermittently. This PR installs a custom allocation error hook for nightly compilers, which attempts to trigger a panic after printing the initial error message. Hopefully, this will allow us to retrieve a backtrace when one of these large spurious allocations occurs. The hook is installed in `librustc_driver`, so that other compiler frontends (e.g. clippy) will get this logic as well. I'm unsure if this needs to be behind any kind of additional feature gate, beyond being ngithly only. This only affects compiler frontends, not generic users of `libstd`. While this will nake OOM errors on nightly much more verbose, I don't think this is necessarily a bad thing. I would expect that out of memory errors when running the compiler are usually infrequent, so most users will probably never notice this change. If any users are experiencing rust-lang#66342 (or something like it) on their own crates, the extra output might even be useful to them. I don't know of any reasonable way of writing a test for this. I manually verified the implementation by inserting: `let _a: Vec<usize> = Vec::with_capacity(9999999999)` into `librustc_driver` after the hook installation, and verified that a backtrace was printed. If we're very unlucky, it may turn out the large allocation on CI happens to occur after several large successful allocations, leaving extremely little memory left when we try to panic. If this is the case, then we may fail to panic or print the backtrace, since panicking currently allocates memory. However, we will still print an error message, so the output will be no less useful (though a little more spammy) then before.

Aaron1011 · 2019-11-13T22:58:25Z

@Mark-Simulacrum: I've updated the PR to explicitly print a backtrace instead of panicking.

Mark-Simulacrum · 2019-11-13T23:13:02Z

I'm going to do r? @alexcrichton here on the general approach -- it seems largely reasonable that we'd want to print a backtrace here (maybe even in general?). I am somewhat concerned by the fact that this is still sticking itself into liballoc instead of being done outside that, which I would've expected to be possible...

Mark-Simulacrum · 2019-11-13T23:13:32Z

Also worth noting that #66342 (comment) makes this not urgent, and as such we may not want it :)

alexcrichton · 2019-11-14T15:18:27Z

Yes can this avoid adding a function to libstd? Can the hook live in the caller? (I think it can just be a closure?)

Aaron1011 · 2019-11-14T17:34:33Z

That would require exposing dumb_print. I think it's better to expose a self-contained alloc error hook.

Mark-Simulacrum · 2019-11-14T17:48:24Z

It seems fine to just use eprintln in this case? AFAICT dumb_print isn't doing anything too special on most platforms we care about for this purpose...

alexcrichton · 2019-11-14T18:04:42Z

Yes dumb_print isn't very special, it should either use eprintln! or just be copied into the rustc crates.

bjorn3 · 2019-11-14T18:08:29Z

eprintln would try to lock stderr. That would cause a deadlock when another thread keeps stderr locked. dumb_print however would write to stderr even if it is being locked.

Mark-Simulacrum · 2019-11-14T18:23:05Z

I think we don't really care about that failure case for the compiler (we're going to die, if we deadlock instead... well, not great, but not too terrible either).

Aaron1011 · 2019-11-14T18:44:31Z

While we can't guarantee that printing a backtrace here will succeed, I don't think we should pick a more-deadlock-prone solution solely to keep a tiny function out of libstd.

alexcrichton · 2019-11-15T01:20:26Z

Ok, then I think this should copy dumb_print into rustc to use it there, this still shouldn't be adding a function to libstd.

Aaron1011 · 2019-11-15T02:41:28Z

@alexcrichton: dumb_print calls panic_output, which is private. We would need to expose that as well, since it's platform specific.

It seems much easier to expose a two-line perma-unstable function than to try to duplicate the necessary bits of libstd, or to deal with a unecessarily suboptimal print function.

Mark-Simulacrum · 2019-11-15T02:51:23Z

It seems likely that what we may want here is an unstable try_lock on std::io::{Stderr, Stdout} which we could then use. Presumably that can be FCP'd through libs team and is in general a much more general thing -- i.e., more reasonable to add to the libraries.

alexcrichton · 2019-11-15T16:51:46Z

I understand it's easier to expose unstable functionality from libstd than to copy it into the compiler. That being said it is not a free cost because then the libs team needs to maintain it, field questions about its stability, etc. My personal preference is pretty firm, this feature for rustc should not at this time result in unstable APIs being added to libstd.

Mark-Simulacrum · 2019-11-15T16:58:36Z

(To be clear, I was musing on the general topic of "I want to print to stdout/stderr in a non-deadlocking manner" -- I don't think that addition should happen as part of this work.)

I suspect that the best thing to do here might just be to close this; it's clear that it's a pretty hard problem and it's not super clear on exactly what we want to do (I still personally think that deadlocking might be more or less fine, and rather infrequent -- things holding stdout/stderr locks in the compiler is pretty rare, and println! and eprintln! I think for the most part shouldn't allocate within the lock?).

bjorn3 · 2019-11-15T17:01:28Z

I think it is useful for normal users to have a backtrace on OOM too, instead of just reserving it for rustc.

alexcrichton · 2019-11-16T00:15:05Z

A feature request for the compiler to debug an error happening on CI is not the proper place to make a proposal for an official unstable API to be added to the standard library. If that's the way this is going then this should be closed and reopened where its main focus is adding new APIs to the standard library. Adding APIs shouldn't "just do it off-hand if rustc needs it"

Mark-Simulacrum · 2019-11-16T00:22:41Z

I'm going to close this as I don't think it's productive to keep it open. I believe that the next step, if we want this, is to open a new PR adding a try_lock API or so to Stderr/Stdout (unstable, of course). I don't personally care about this though especially as we've now figured out the underlying cause in this specific problem.

rust-highfive assigned cramertj Nov 12, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 12, 2019

Aaron1011 added 2 commits November 13, 2019 17:38

Print backtrace instead of panicking

5c97193

Aaron1011 force-pushed the feature/oom-backtrace branch from d3f302a to 5c97193 Compare November 13, 2019 22:56

rust-highfive assigned alexcrichton and unassigned cramertj Nov 13, 2019

Mark-Simulacrum closed this Nov 16, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trigger a panic on allocation error on nightly Rust #66347

Trigger a panic on allocation error on nightly Rust #66347

Aaron1011 commented Nov 12, 2019

rust-highfive commented Nov 12, 2019

Mark-Simulacrum commented Nov 12, 2019

Aaron1011 commented Nov 13, 2019

Mark-Simulacrum commented Nov 13, 2019

Mark-Simulacrum commented Nov 13, 2019

alexcrichton commented Nov 14, 2019

Aaron1011 commented Nov 14, 2019

Mark-Simulacrum commented Nov 14, 2019

alexcrichton commented Nov 14, 2019

bjorn3 commented Nov 14, 2019

Mark-Simulacrum commented Nov 14, 2019

Aaron1011 commented Nov 14, 2019

alexcrichton commented Nov 15, 2019

Aaron1011 commented Nov 15, 2019

Mark-Simulacrum commented Nov 15, 2019

alexcrichton commented Nov 15, 2019

Mark-Simulacrum commented Nov 15, 2019

bjorn3 commented Nov 15, 2019

alexcrichton commented Nov 16, 2019

Mark-Simulacrum commented Nov 16, 2019

Trigger a panic on allocation error on nightly Rust #66347

Trigger a panic on allocation error on nightly Rust #66347

Conversation

Aaron1011 commented Nov 12, 2019

rust-highfive commented Nov 12, 2019

Mark-Simulacrum commented Nov 12, 2019

Aaron1011 commented Nov 13, 2019

Mark-Simulacrum commented Nov 13, 2019

Mark-Simulacrum commented Nov 13, 2019

alexcrichton commented Nov 14, 2019

Aaron1011 commented Nov 14, 2019

Mark-Simulacrum commented Nov 14, 2019

alexcrichton commented Nov 14, 2019

bjorn3 commented Nov 14, 2019

Mark-Simulacrum commented Nov 14, 2019

Aaron1011 commented Nov 14, 2019

alexcrichton commented Nov 15, 2019

Aaron1011 commented Nov 15, 2019

Mark-Simulacrum commented Nov 15, 2019

alexcrichton commented Nov 15, 2019

Mark-Simulacrum commented Nov 15, 2019

bjorn3 commented Nov 15, 2019

alexcrichton commented Nov 16, 2019

Mark-Simulacrum commented Nov 16, 2019