Registry push fails with "unexpected commit digest" #3856

Open
aaronlehmann opened this issue May 10, 2023 · 5 comments

@aaronlehmann
Collaborator

Seen after an upgrade to buildkit v0.11 (08941b1):

/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to push [registry/name/tag omitted]: failed commit on ref "layer-sha256:5fc8839f3cb47293c8e4505ab59e4da0e2133c77b2036cd90b21742d1ebd6b99": unexpected commit digest sha256:e58168ce66f4e17b1c89ff4584004deb04ad5b4e91a7f6cbacef8739b4a6de9a, expected sha256:5fc8839f3cb47293c8e4505ab59e4da0e2133c77b2036cd90b21742d1ebd6b99: failed precondition

The error is repeated for a few different refs being pushed simultaneously, with the same SHAs referenced.

It's followed by a stack trace including github.com/moby/buildkit/exporter/containerimage.(*imageExporterInstance).Export (the rest of the stack trace doesn't seem very interesting, and just seems to show the Solve handler).

I suspect this may be related to the use of MergeOp in the build graph, as we haven't seen this error on similar build systems where MergeOp isn't used.

cc @sipsma @coryb

@tonistiigi
Member

Is it a multi-platform build or a build where the result could contain multiple identical layers?

@aaronlehmann
Collaborator Author

Single-platform build. Not sure if the images could contain identical layers. The build graph pushes multiple images, which will share layers.

@sipsma
Collaborator

sipsma commented May 22, 2023

I looked into this a bit more. I realized that the specific error message "unexpected commit digest" seems to only come from containerd's local content store implementation. That would suggest to me that the error is occurring while layers are being unlazied locally in order to push them to the registry, as opposed to during the push itself. It would make sense that this occurs when using MergeOp since MergeOp keeps all inputs lazy.

I didn't notice anything looking through the relevant code yet, but if you have any buildkitd logs from around the time these errors occurred @aaronlehmann, that could be super helpful. Specifically, any error logs and any debug logs for the http requests made to registries would be the most helpful, but as much as you can share would be ideal.
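For readers unfamiliar with this class of error, here is a minimal sketch of what a digest check at commit time looks like in a content-addressed store. It is an illustration only, not containerd's code: `BlobWriter`, `DigestMismatch`, and `digest_of` are hypothetical names, and only the wording of the error message is taken from the report above.

```python
import hashlib


class DigestMismatch(Exception):
    """Raised when the written bytes do not hash to the expected digest."""


def digest_of(data):
    """Return the sha256 digest of `data` in the usual 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class BlobWriter:
    """Hypothetical stand-in for a content-store writer (not containerd's API)."""

    def __init__(self, ref, expected):
        self.ref = ref              # e.g. "layer-sha256:<hex>"
        self.expected = expected    # digest the caller claims the blob has
        self._hash = hashlib.sha256()

    def write(self, chunk):
        # Hash the bytes as they arrive; a real store would also persist them.
        self._hash.update(chunk)

    def commit(self):
        # The blob is accepted only if its content hashes to the expected digest.
        actual = "sha256:" + self._hash.hexdigest()
        if actual != self.expected:
            raise DigestMismatch(
                f'failed commit on ref "{self.ref}": unexpected commit digest '
                f"{actual}, expected {self.expected}"
            )
        return actual
```

Under this model, the error means the bytes that were written locally hashed to e58168ce... while the descriptor being committed claimed 5fc8839f..., which fits the theory that the mismatch arises while producing the blob locally rather than on the registry side.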

@tonistiigi
Member

> Specifically, any error logs and any debug logs

Also, it would be good to track down what these digests are: "sha256:e58168ce66f4e17b1c89ff4584004deb04ad5b4e91a7f6cbacef8739b4a6de9a, expected sha256:5fc8839f3cb47293c8e4505ab59e4da0e2133c77b2036cd90b21742d1ebd6b99". I would hope one of them is a blob in the state directory, but I'm not sure which one is the correct one. If we find the layer, then which image (or --cache-from) did it come from, and how was that image used in the build? It is possible the same digest is somewhere in the progress logs as well, for example when it is doing cache import.
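One way to look for those digests on disk is sketched below. It assumes only that blobs are stored as regular files somewhere under the state directory; the directory to scan and the helper names `sha256_file` and `find_digests` are hypothetical. The script hashes every regular file it can read, so it may be slow on a large state directory.

```python
import hashlib
import os


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex sha256 of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_digests(root, wanted):
    """Return (path, file_name, content_hex) for each regular file under `root`
    whose name or whose content hash equals one of the `wanted` digests."""
    wanted_hex = {d.split(":", 1)[-1] for d in wanted}  # strip "sha256:" if present
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip sockets, FIFOs, broken symlinks
            try:
                content_hex = sha256_file(path)
            except OSError:
                continue  # unreadable file; ignore in this sketch
            if name in wanted_hex or content_hex in wanted_hex:
                hits.append((path, name, content_hex))
    return hits
```

Calling `find_digests(state_dir, [...])` with the two digests from the error would list any matching files. A hit whose file name differs from its content hash would be especially interesting, since it would point at a blob whose bytes do not match the digest it is stored under.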

@aaronlehmann
Collaborator Author

sha256:e58168ce66f4e17b1c89ff4584004deb04ad5b4e91a7f6cbacef8739b4a6de9a does not seem to exist on our registry.

sha256:5fc8839f3cb47293c8e4505ab59e4da0e2133c77b2036cd90b21742d1ebd6b99 does exist. The upload timestamp is 6 minutes before this error message.

This builder instance doesn't exist anymore, so all I have to go on is the logs, which are quite busy. Unfortunately, reproducing the issue is not straightforward, since it only showed up in production use, not in our automated tests.

e58168ce66f4e17b1c89ff4584004deb04ad5b4e91a7f6cbacef8739b4a6de9a only seems to appear in these error messages, while 5fc8839f3cb47293c8e4505ab59e4da0e2133c77b2036cd90b21742d1ebd6b99 has quite a few registry ops that reference it. I'm sharing logs referencing these with you privately.
