Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace context.WithCancel with WithCancelCause #4457

Merged
merged 2 commits into from
Dec 12, 2023

Conversation

tonistiigi
Copy link
Member

@tonistiigi tonistiigi commented Dec 4, 2023

Start to bring some sanity to the "context canceled" errors without a stacktrace.

TODO:

  • Same for Deadline/Timeout see replace context.WithCancel with WithCancelCause #4457 (comment)
  • For cases that do defer cancel() we could have a helper that adds fallback handling for code paths (in vendor) that return ctx.Err() directly and add the stacktrace from the Cause() still to the error. let's do this in follow-up as not directly related

@@ -41,6 +41,8 @@ linters-settings:
forbid:
- '^fmt\.Errorf(# use errors\.Errorf instead)?$'
- '^logrus\.(Trace|Debug|Info|Warn|Warning|Error|Fatal)(f|ln)?(# use bklog\.G or bklog\.L instead of logrus directly)?$'
- '^context\.WithCancel(# use context\.WithCancelCause instead)?$'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Should we require context.WithTimeoutCause and context.WithDeadlineCause in this PR also? (I don't think we use WithDeadline anywhere, but should probably forbid it for the future)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, marked in TODO comment

Copy link
Member Author

@tonistiigi tonistiigi Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not sure if anything can be done for this. Looks like WithTimeoutCause/WithDeadlineCause have a completely different signature and cause is added as a parameter instead of returning func(error). This isn't really useful as it is basically just doing ctx.WithValue("cause", cause) and doesn't detect when the cancellation actually happened.

Note that if context actually reached timeout then this is somewhat expected as it will happen somewhere in stdlib but if CancelFunc gets called then with this stdlib function it does not allow detecting the location.

Maybe for the defer() case still something can be done about this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess maybe something like this could work?

cause := &timeoutCause{}
ctx, cancel := context.WithTimeoutCause(ctx, ..., cause.Init())
cancelErr = cause.WithCancel(cancel)


cancelErr(errors.WithStack(context.DeadlineExceeded))


type timeoutCause {
  error
}

func (t *timeoutCause) Init() {
  // todo: atomic
  // store the context creation stack. this will be shown when timeout is reached and runtime cancels
  t.error = errors.WithStack(context.DeadlineExceeded)
}

func (t *timeoutCause) WithCancel(c func()) func(error) {
  return func(e error) {
    // todo: atomic
    // store the context cancellation stack. this will be shown when cancel is called manually
    t.error = errors.WithStack(o)
  }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the go issue golang/go#56661 there is also another workaround mentioned.

ctx, cancel := context.WithCancelCause(context.Background())
ctx, _ = context.WithTimeoutCause(ctx, 1time.Second, tooSlow)

what is simpler but I'm afraid this will cause a linter error for leaking context that needs to be disabled all the time. Really don't understand why WithTimeoutCause just does not return CancelCauseFunc.

Copy link
Collaborator

@coryb coryb Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm for WithTimeoutCause I use something like:

context.WithTimeoutCause(ctx, time.Minute*5, errors.WithStack(context.DeadlineExceeded))

So we can get the stack for which specific timeout what triggered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will give you the stacktrace to place that created the context. But for example if you have:

{
ctx, cancel := context.WithTimeoutCause(ctx, time.Minute*5, errors.WithStack(context.DeadlineExceeded))
defer cancel()

go func() {
  // leaky goroutine
  err := foo(ctx)
  // cancellation error because context was discarded but no stacktrace to where it happened
}()

return nil
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yeah, maybe add a helper function to wrap timeouts, and add that to the forbidigo rule?

func withTimeout(ctx context.Context, d time.Duration) (context.Context, func(error)) {
    ctx, cancel := context.WithCancelCause(ctx)
    ctx, _ = context.WithTimeoutCause(ctx, d, errors.WithStack(context.DeadlineExceeded))
    return ctx, cancel
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I turns out the linter only has this leak detection for context.WithTimeout but nobody has yet added a rule for context.WithTimeoutCause yet. So this is a problem for someone in the future trying to update the linter 😉

err = ctx.Err()
// Cause can't be used here because this error is returned for Err() in custom context
// implementation and unfortunately stdlib does not allow defining Cause() for custom contexts
err = ctx.Err() //nolint: forbidigo
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is no way to define Cause() atm for custom implementations of context interface so this needs to return Err() directly. Maybe this can be improved in the future if we could avoid defining Err() and instead would use embedded stdlib context. But I think we can look at this case separately.

@tonistiigi tonistiigi marked this pull request as ready for review December 5, 2023 18:18
Copy link
Collaborator

@coryb coryb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

solver/internal/pipe/pipe_test.go Outdated Show resolved Hide resolved
Keep stack traces for cancellation errors where possible.

Signed-off-by: Tonis Tiigi <[email protected]>
@tonistiigi tonistiigi merged commit 7eb2c8e into moby:master Dec 12, 2023
60 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants