[ECR] [request]: support cache manifest #876

Closed
lifeofguenter opened this issue May 5, 2020 · 132 comments
Labels: Coming Soon; ECR (Amazon Elastic Container Registry); OSS (Open-source software); Proposed (Community submitted issue)

@lifeofguenter

lifeofguenter commented May 5, 2020

Would be great if ECR could support cache-manifest (see: https://medium.com/titansoft-engineering/docker-build-cache-sharing-on-multi-hosts-with-buildkit-and-buildx-eb8f7005918e)

NOTE FROM AWS: We shipped this on BuildKit 0.12, see here for details - https://aws.amazon.com/blogs/containers/announcing-remote-cache-support-in-amazon-ecr-for-buildkit-clients/. We are keeping this issue open for the time being to allow the community to discuss and gather further feedback

@TBBle

TBBle commented Oct 27, 2020

BuildKit 0.8 will default to using an OCI media type for its caches (see moby/buildkit#1746) which I assume should make this work, but I haven't tested it myself.

@aleks-fofanov

It still doesn't work with the recently released BuildKit 0.8.0.
It can write the layers and config, but it is unable to upload the manifest to ECR:

=> ERROR exporting cache                                                                                                     5.4s
 => => preparing build cache for export                                                                                       0.2s
 => => writing layer sha256:0d48cc65d93fe2ee9877959ff98ebc98b95fe4b2fc467ff50f27103c1c5d6973                                  0.3s
 => => writing layer sha256:2ade286d53f2e045413601ca0e3790de3792ea34abd3d025cd2cd9c3cb5231de                                  0.3s
 => => writing layer sha256:64befcf53942ba04c144cde468548885d497e238001e965e983e39eb947860c2                                  0.3s
 => => writing layer sha256:7415f0cbea8739c1bf353568b16ac74a9cfbc0b36327602e3a025abf919a38a6                                  0.3s
 => => writing layer sha256:76a1f73c618c30eb1b1d90cf043fe3f855a1cce922d1fb47458defd3dbe1c783                                  0.3s
 => => writing layer sha256:8674739c0ada3e834b816667d26dd185aa5ea089f33701f11a05b7be03f43026                                  0.3s
 => => writing layer sha256:9dc80bcd2805b2a441bd69bc9468df2e81994239e34879567bed7bdef6cb605d                                  0.3s
 => => writing layer sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08                                  0.3s
 => => writing layer sha256:ce4e6de84945ab498f65d16920c9b801dfea3792871e44f89e6438e232a690b3                                  0.3s
 => => writing layer sha256:d46583c5d4c69b34cb46866838d68f53a38686dc7f2d1347ae0f252e8eb0ed4c                                  0.2s
 => => writing config sha256:33c76a0f8a74a06e461926d8a8d1845371c0cf9e86753db2483a4873aede8889                                 2.0s
 => => writing manifest sha256:0f69a7e6626f6a24a0a95ed915613ebdf9459280d4986879480d87e34849aea8                               0.6s
------
 > importing cache manifest from XXXXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/test-repo:buildcache:
------
------
 > exporting cache:
------
error: failed to solve: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref "sha256:0f69a7e6626f6a24a0a95ed915613ebdf9459280d4986879480d87e34849aea8": unexpected status: 400 Bad Request

@errm

errm commented Dec 9, 2020

I am seeing the same error on buildkit 0.8.0

even when setting oci-mediatypes explicitly to true: --export-cache type=registry,ref=${REPO}:buildcache,oci-mediatypes=true

 => ERROR exporting cache                                                                        1.4s
 => => preparing build cache for export                                                          0.0s
 => => writing layer sha256:757d39990544d20fbebf7a88e29a5dd2bb6a4fdb116d67df9fe8056843da794d     0.1s
 => => writing layer sha256:7597eaba0060104f2bd4f3c46f0050fcf6df83066870767af41c2d7696bb33b2     0.1s
 => => writing config sha256:0e308fd4eee4cae672eee133cbd77ef7c197fa5d587110b59350a99b289f7000    0.8s
 => => writing manifest sha256:8eb142b16e0ec25db4517f2aecff795cca2b1adbe07c32f5c571efc5c808cbcd  0.3s
------
 > importing cache manifest from xxx.dkr.ecr.us-east-1.amazonaws.com/errm/test:buildcache:
------
------
 > exporting cache:
------
error: failed to solve: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref "sha256:8eb142b16e0ec25db4517f2aecff795cca2b1adbe07c32f5c571efc5c808cbcd": unexpected status: 400 Bad Request

Daemon logs:

time="2020-12-09T13:42:48Z" level=info msg="running server on /run/buildkit/buildkitd.sock"
time="2020-12-09T13:44:09Z" level=warning msg="reference for unknown type: application/vnd.buildkit.cacheconfig.v0"
time="2020-12-09T13:44:10Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref \"sha256:8eb142b16e0ec25db4517f2aecff795cca2b1adbe07c32f5c571efc5c808cbcd\": unexpected status: 400 Bad Request\n"

@AlexLast

AlexLast commented Jan 4, 2021

Also seeing this for private repos, although it doesn't seem to be an issue with public ECR repos..

@n1ru4l

n1ru4l commented Feb 8, 2021

Is there a timeframe for this feature request available? This could help to tremendously speed up CI builds.

@jellevanhees

We have been experimenting with this BuildKit feature for some time now and it works wonders.
Currently, we are still dependent on Docker Hub, so having this functionality in private ECR would greatly benefit our CI/CD workflow.

@davidfm

davidfm commented Mar 16, 2021

Any indication as to if/when this will ever be available? Using buildkit would really improve our CI build times

@devopsmash

One year passed and still nothing 😔

@pieterza

pieterza commented Jun 1, 2021

We'd really like to see support of this with ECR private repos 🙏

As of today, it still does not work:

error: failed to solve: rpc error: code = Unknown desc = error writing manifest blob: failed commit on ref "sha256:75f32e1bb4df7c6333dc352ea3ea9d04d1e04e4a14ba79b59daa019074166519": unexpected status: 400 Bad Request

@hf

hf commented Jun 13, 2021

Yes please!

@abatilo

abatilo commented Aug 21, 2021

Can we get any kind of communications on this?

@renannprado

Is there any workaround available?

@ynouri

ynouri commented Nov 2, 2021

For teams using GitHub but wishing to keep images in ECR, it is possible to leverage the cache manifest support in GitHub Container Registry (GHCR) and push the image to ECR at the same time. When pushing to ECR, only new layers get pushed.

GitHub Actions workflow example:

jobs:

  docker_build:
    strategy:
      matrix:
        name:
          - my-image
        include:
          - name: my-image
            registry_ecr: my-aws-account-id.dkr.ecr.us-east-1.amazonaws.com
            registry_ghcr: ghcr.io/my-github-org-name
            dockerfile: ./path/to/Dockerfile
            context: .
            extra_args: ''

    steps:
      - uses: actions/checkout@v2

      - name: Install Buildkit
        uses: docker/setup-buildx-action@v1
        id: buildx
        with:
          install: true

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
          role-skip-session-tagging: true
          role-duration-seconds: 1800
          role-session-name: GithubActionsBuildDockerImages

      - name: Login to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build & Push (ECR)
        # - https://docs.docker.com/engine/reference/commandline/buildx_build/
        # - https://github.com/moby/buildkit#export-cache
        run: |
          docker buildx build \
            --cache-from=type=registry,ref=${{ matrix.registry_ghcr }}/${{ matrix.name }}:cache \
            --cache-to=type=registry,ref=${{ matrix.registry_ghcr }}/${{ matrix.name }}:cache,mode=max \
            --push \
            ${{ matrix.extra_args }} \
            -f ${{ matrix.dockerfile }} \
            -t ${{ matrix.registry_ecr }}/${{ matrix.name }}:${{ github.sha }} \
            ${{ matrix.context }}

@abatilo

abatilo commented Nov 3, 2021

@ynouri Just be careful of your storage costs in GHCR. It's oddly expensive. I found https://github.com/snok/container-retention-policy to help solve that use case for me.

@kgns

kgns commented Nov 3, 2021

ECR still not supporting this is unbelievably amateurish, it doesn't suit AWS...

@pieterza

pieterza commented Nov 5, 2021

Is there any workaround available?

Use another Docker registry. Dockerhub, or perhaps your own tiny EC2 with some fat storage.
Sucks, but AWS doesn't seem interested.

@poldridge

This seems to have started working unannounced, at least when using docker 20.10.11 to build

@ramosbugs

This seems to have started working unannounced, at least when using docker 20.10.11 to build

I'm still seeing error writing manifest blob with 400 Bad Request on Docker 5:20.10.12~3-0~ubuntu-focal, at least in us-west-2.

@kgns

kgns commented Dec 19, 2021

This seems to have started working unannounced, at least when using docker 20.10.11 to build

is this confirmed?

@BeyondEvil

This seems to have started working unannounced, at least when using docker 20.10.11 to build

is this confirmed?

I'm wondering the same thing.

Could you share some more info @poldridge ?

@eduard-malakhov

I've just faced the same issue with Docker version 20.10.12, build e91ed57. Would appreciate any hints or workarounds.

@sherifabdlnaby

This seems to have started working unannounced, at least when using docker 20.10.11 to build

Did not work for me using docker:20.10.11-dind and ECR us-west-2.

@sherifabdlnaby

Can we get any kind of communication on this? Being able to use remote cache is a major benefit to all our build pipelines.

@kattmang

kattmang commented Aug 15, 2023

I've also been following this issue for a while, and it is great to see it working now. Thanks, everyone! But I have another question about my setup: the cache is finally pushed to ECR, but it took 4 minutes to export the cache for a 400M Docker image, which increases the CI time rather than saving any. Even on a second run of the same docker build, it still spent about 2 minutes exporting the cache. Has anyone seen the same behavior? Thanks.

If your subsequent build takes 2 minutes to export the cache, that could (unscientifically) suggest that half of your cache is invalid or has new data build-over-build. I would double-check whether your Dockerfile orders its instructions in a well-optimized way: https://docs.docker.com/build/cache/#how-can-i-use-the-cache-efficiently . Failing that, Earthly's docs have a great passage on the types of image where remote caches (or caches in general) may not help you:

Making use of explicit caching effectively may not always be possible. Sometimes the overhead of uploading and redownloading the cache defeats the purpose of gaining build performance. Oftentimes, multiple iterations of trial-and-error need to be attempted in order to optimize its effectiveness. Keep in mind that caching compute-heavy targets is more likely to yield results, rather than download-heavy targets.

@kattmang

kattmang commented Aug 15, 2023

hey @robg-eb : sorry for the delayed response, I was looking into a colleague experiencing the same issue; their problem was related to the type of image they were pushing, which was an image without intermediate layers in their dockerfile. I would double check that the image you're pushing has more than two lines in its dockerfile (beyond a FROM and an ENTRYPOINT line).

If that's not the problem, we probably need to surface the actual error message ECR gives back to Docker when the push fails. To be more exact, Docker, when using BuildKit, depends on containerd to push images, layers, and cache blobs out to remote registries. I believe I have identified a problem with containerd swallowing our and other OCI-compliant registries' error messages, which I've raised here: containerd/containerd#8969

@robg-eb

robg-eb commented Aug 15, 2023

@kattmang - Sure enough, adding another line to my Dockerfile did resolve the 405 Method Not Allowed error! I only had a FROM line, and of course it makes sense that this would not be enough to create a cache. We'd love to try this in our real workflow now, but would like to see some official docs from AWS on using it before doing so. Do you know when those are planned to be published?

@kattmang

Hey @robg-eb, to put my customer hat on (I was one not too long ago): I wouldn't want to own a separate BuildKit dependency myself in my CI/CD platform, and would rather have Docker and Docker Engine own their dependency tree in production.

That's why we're waiting to release any guidance until docker 25.0 which packages buildkit 0.12 as a direct/indirect dependency. This appears to be launching soon here. I've talked to some project folks there, and they don't have a definite timeline.

We do like folks trying out buildkit 0.12.x already though!
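
A pipeline can guard on the BuildKit version before relying on this. The following is a minimal sketch in POSIX shell, not an official check: the helper name version_ge and the BUILDKIT_VERSION variable are hypothetical, and how the version is obtained from your builder is left as an assumption. It only compares a version string against the 0.12 minimum mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 >= version $2 (major.minor only).
version_ge() {
  have=${1#v}; want=${2#v}          # tolerate a leading "v", e.g. v0.12.3
  have_major=${have%%.*}; have_rest=${have#*.}; have_minor=${have_rest%%.*}
  want_major=${want%%.*}; want_rest=${want#*.}; want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] && return 0
  [ "$have_major" -lt "$want_major" ] && return 1
  [ "$have_minor" -ge "$want_minor" ]
}

# Assumption: BUILDKIT_VERSION is supplied by the caller; 0.12.3 is only a placeholder default.
BUILDKIT_VERSION="${BUILDKIT_VERSION:-0.12.3}"

if version_ge "$BUILDKIT_VERSION" 0.12; then
  echo "BuildKit $BUILDKIT_VERSION: image-manifest=true cache export should be usable with ECR"
else
  echo "BuildKit $BUILDKIT_VERSION: older than 0.12, cache export to ECR is likely to fail"
fi
```

Patch versions are ignored on purpose, since the thread only distinguishes releases by major.minor.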

@BwL1289

BwL1289 commented Aug 16, 2023

@kattmang I've reached out on twitter. @rafavallina unfortunately I can't DM you bc I'm not verified.

@rafavallina

Hi @BwL1289 I followed you on Twitter. You can follow me back in case you need to also DM me :)

@BwL1289

BwL1289 commented Aug 16, 2023

@rafavallina done! DMed you

@Slevy35

Slevy35 commented Sep 3, 2023

I've been following this issue for a while and it is great to see this live. Thanks everyone!

I managed to have it working in GitHub Actions using the following, in case someone has a similar setup:

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          driver-opts: image=moby/buildkit:latest

      - name: Build Image
        uses: docker/build-push-action@v3
        env:
          ECR_REPOSITORY: ${{ ... }}
          ECR_REGISTRY: ${{ ... }}
        with:
          file: ci/Dockerfile
          tags: ${{ ... }}
          cache-from: type=registry,ref=${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:cache
          cache-to: mode=max,image-manifest=true,type=registry,ref=${{ env.ECR_REGISTRY }}/${{ env.ECR_REPOSITORY }}:cache
          push: true
          pull: true
          secrets: |
            "***"

I'm using a sidecar in a k8s cluster that runs the BuildKit rootless image. I'm using the same parameters that you provided, but the push of the cache image is still failing.

the command:

docker buildx build --push -t ${ECR}/${IMAGE}:${TAG} \
  --cache-from=type=registry,ref=${ECR}/${IMAGE}:cache \
  --cache-to=type=registry,ref=${ECR}/${IMAGE}:cache,oci-mediatypes=true,mode=max,ignore-error=true \
  --file ${DOCKERFILE_PATH} .

the output:

[2023-09-03T13:28:28.238Z] #19 exporting cache to registry
[2023-09-03T13:28:28.238Z] #19 preparing build cache for export
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:04d12edb9a059166eecc48b284c890571a8e20578426410c9ed2b423ecfb790e
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:04d12edb9a059166eecc48b284c890571a8e20578426410c9ed2b423ecfb790e 1.0s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:05c125d691a6dbcd57416f1cd229ea688b44fd3c107cf0c972335d0f29e9d118 0.1s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:1e12e28c9eab0295ed027be9d593c9ef0d65461e45af3c7f09c5b2dfd0557174
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:1e12e28c9eab0295ed027be9d593c9ef0d65461e45af3c7f09c5b2dfd0557174 0.3s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:22515cba19a47132a0de55b237d32719b554f56ba16e1cd3c4b0e89897cf96a6
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:22515cba19a47132a0de55b237d32719b554f56ba16e1cd3c4b0e89897cf96a6 1.8s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:3ff76a909edf10b2b9386f7d06101f87185dcbc84e041d62e681266a2205aef4
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:3ff76a909edf10b2b9386f7d06101f87185dcbc84e041d62e681266a2205aef4 0.0s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:42c48b4a3aa7a518f44d9bc62b01d2d98255fca89cdcecd6bc083dd9b7b9ca19
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:42c48b4a3aa7a518f44d9bc62b01d2d98255fca89cdcecd6bc083dd9b7b9ca19 1.7s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:44ba2882f8eb14264e5f2f9f6ec55bcf5306527b637279f2cd9d4858762388af 0.1s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:4517bdd299a1a03209668b96d5a1e33b04e1ef9bdbd5f9dd80310381b7d5c540
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:4517bdd299a1a03209668b96d5a1e33b04e1ef9bdbd5f9dd80310381b7d5c540 0.4s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 0.4s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:7502f7bef1412d77238ab53dd9682851901cdd3192f85babd907a30ebfc44e55
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:7502f7bef1412d77238ab53dd9682851901cdd3192f85babd907a30ebfc44e55 0.0s done
[2023-09-03T13:29:35.885Z] #19 writing layer sha256:8302227edf612edde0fb074ac895c78ce42055887c5b237f43c82cbd52c80a98
[2023-09-03T13:29:42.434Z] #19 writing layer sha256:8302227edf612edde0fb074ac895c78ce42055887c5b237f43c82cbd52c80a98 7.4s done
[2023-09-03T13:29:42.434Z] #19 writing layer sha256:84e976e63c97cac21a616478564450d937a8a6c2a889a161e8c09dfddc178578
[2023-09-03T13:29:43.353Z] #19 writing layer sha256:84e976e63c97cac21a616478564450d937a8a6c2a889a161e8c09dfddc178578 1.3s done
[2023-09-03T13:29:43.353Z] #19 writing layer sha256:bacce16f9429ecb6d345f74b58f52bcca14ba612f8cb08990f70378b9cc562b8 0.0s done
[2023-09-03T13:29:43.353Z] #19 writing layer sha256:c3cb741fb556fa1f36a4f09399543828d596d1449e1350b2f6c7111dff0820bf 0.0s done
[2023-09-03T13:29:43.353Z] #19 writing layer sha256:c6973ff5b9dc9d79f4b4baf0d3fc978504ef56982ced5c18fa51e33f6679db4b
[2023-09-03T13:29:43.909Z] #19 writing layer sha256:c6973ff5b9dc9d79f4b4baf0d3fc978504ef56982ced5c18fa51e33f6679db4b 0.3s done
[2023-09-03T13:29:43.909Z] #19 writing layer sha256:d1b005ddbf5b9d26565261fbcf34d1cdc88a1af3482060f886f9916b7f1facde
[2023-09-03T13:30:15.906Z] #19 writing layer sha256:d1b005ddbf5b9d26565261fbcf34d1cdc88a1af3482060f886f9916b7f1facde 30.7s done
[2023-09-03T13:30:15.906Z] #19 writing layer sha256:ef5ffcd836544ebe233558539d5814eedc6264a5841f3b291173efcd5eafce0f 0.0s done
[2023-09-03T13:30:15.906Z] #19 writing layer sha256:f914c0658bc606114b1e591cba85cc9aec11e17b81c5d9f9ff946d6f15a98271 0.0s done
[2023-09-03T13:30:15.906Z] #19 writing config sha256:f44c7b2db091803c86e7fcab1b5ffd312654e6cdd75d84530f44a36fcf6ffb4b
[2023-09-03T13:30:15.906Z] #19 writing config sha256:f44c7b2db091803c86e7fcab1b5ffd312654e6cdd75d84530f44a36fcf6ffb4b 0.4s done
[2023-09-03T13:30:15.906Z] #19 writing cache manifest sha256:856d69b71534df97892a6b30ebd7c9f67b701bc61f131a045b66f0dcc92bdaf1
[2023-09-03T13:30:15.906Z] #19 preparing build cache for export 107.0s done
[2023-09-03T13:30:15.906Z] #19 writing cache manifest sha256:856d69b71534df97892a6b30ebd7c9f67b701bc61f131a045b66f0dcc92bdaf1 0.3s done
[2023-09-03T13:30:15.906Z] #19 ERROR: error writing manifest blob: failed commit on ref "sha256:856d69b71534df97892a6b30ebd7c9f67b701bc61f131a045b66f0dcc92bdaf1": unexpected status from PUT request to https://<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/v2/<IMAGE>/manifests/cache: 400 Bad Request
[2023-09-03T13:30:17.261Z] ------
[2023-09-03T13:30:17.261Z]  > importing cache manifest from <ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/<IMAGE>:cache:
[2023-09-03T13:30:17.261Z] ------
[2023-09-03T13:30:17.261Z] ------
[2023-09-03T13:30:17.261Z]  > exporting cache to registry:
[2023-09-03T13:30:17.261Z] ------

@rperryng

rperryng commented Sep 3, 2023

@Slevy35 you are missing the image-manifest=true argument in the --cache-to parameter. Also double check that tag immutability is disabled for your repository.
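
As a rough illustration of that advice, the sketch below composes the cache flags with image-manifest=true included and prints the resulting commands instead of running them. The function names and the ECR, IMAGE and TAG defaults are placeholders, not values from this thread. The aws ecr describe-repositories call is one way to read the repository's tag mutability setting, on the assumption that the AWS CLI is installed and configured.

```shell
#!/bin/sh
# Sketch only: ECR, IMAGE and TAG are placeholders, not values from this thread.
ECR="${ECR:-123456789012.dkr.ecr.us-east-1.amazonaws.com}"
IMAGE="${IMAGE:-my-image}"
TAG="${TAG:-latest}"

# Compose the cache import flag for a given cache reference.
cache_from_flag() {
  printf '%s\n' "--cache-from=type=registry,ref=$1"
}

# Compose the cache export flag, including image-manifest=true for ECR.
cache_to_flag() {
  printf '%s\n' "--cache-to=type=registry,ref=$1,mode=max,image-manifest=true,oci-mediatypes=true"
}

CACHE_REF="${ECR}/${IMAGE}:cache"

# Dry run: print the commands so they can be reviewed before being executed.
echo "aws ecr describe-repositories --repository-names ${IMAGE} --query 'repositories[0].imageTagMutability' --output text"
echo "docker buildx build --push -t ${ECR}/${IMAGE}:${TAG} $(cache_from_flag "$CACHE_REF") $(cache_to_flag "$CACHE_REF") ."
```

When run for real, the first command should report MUTABLE, since the cache tag is overwritten on every build.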

@tomiszili

tomiszili commented Sep 4, 2023

Hello All,
I need some help with the ECR remote cache.
I get the following error in CI: ERROR: failed to configure registry cache importer: invalid reference format. However, in a later step the same image is cached:

#14 importing cache manifest from ***.dkr.ecr.us-west-2.amazonaws.com/pro-build-cache:my-123456-my-pro-build-cache-cache
#14 inferred cache manifest type: application/vnd.oci.image.manifest.v1+json done
#14 DONE 0.3s

I'm using docker compose with the cache_to and cache_from properties set to the following values:

CACHE_FROM="type=registry,ref=${DOCKER_REGISTRY}/pro-build-cache:my-123456-my-pro-build-cache-cache"
CACHE_TO="mode=max,image-manifest=true,oci-mediatypes=true,type=registry,ref=${DOCKER_REGISTRY}/pro-build-cache:my-123456-my-pro-build-cache-cache"

And I invoke compose with the following commands:

docker buildx create --use --driver docker-container --name test-builder1 --driver-opt image=moby/buildkit:latest
COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 BUILDKIT_PROGRESS=plain \
          docker compose --profile=build build \
          --build-arg GIT_VERSION="${GIT_VERSION}"

How can I resolve the first error?

@pinkavaj

The Dockerfile below still returns the 405 Method Not Allowed error unless the RUN line is uncommented. We use such files to create dummy images for tests, and it would be annoying to add a dummy line to all of them. It is also hard for developers to work out where the problem is.

Is there any possible solution for this corner case?

FROM ubuntu:22.04 AS xxx

ARG CI_PIPELINE_ID=not_set
ENV BUILD_BY_CI_PIPELINE_ID=$CI_PIPELINE_ID

#
# RUN echo dummy >f
CMD echo $CI_PIPELINE_ID

@TBBle

TBBle commented Oct 12, 2023

It's worth checking what's actually being pushed in the no-layer case, as it's possible that BuildKit is generating a non-SHOULD-compliant image cache in this case, and could be fairly-simply fixed to not do so.

Per the OCI Image spec guidelines for Artifacts:

If the artifact does not need layers, a single layer SHOULD be included with a non-zero size. The suggested content for an unused layers array is the empty descriptor.

Of course, BuildKit may already be following this guidance, I haven't checked.

If BuildKit is already generating such a compliant image, it seems like it's entirely on AWS to fix ECR to accept such images, as this is surely not the only case where such an artifact would be relevant to users.

Edit:

I briefly checked, and it does not appear to ensure that the layers array is not empty, called from here, called from here.

It would help if someone in a position to check that my analysis of the BuildKit code is correct (i.e. dump the OCI-spec cache config for the breaking case and check that layers is empty) could do so, and then open a feature request on BuildKit.

Late passing thought: If you trivially create an image with no layers, does ECR reject that image too? In theory, it should have the same issue as the empty caches causing 405's here. (ECR might well be what motivated the "SHOULD" in the image-spec)

FROM scratch
ARG CI_PIPELINE_ID=not_set
ENV BUILD_BY_CI_PIPELINE_ID=$CI_PIPELINE_ID
CMD echo $CI_PIPELINE_ID

I went ahead and put up a draft PR to see what the BuildKit team thinks: moby/buildkit#4331. It'd be great to find a way to validate whether that allows these almost-trivial Dockerfile images to push a cache to ECR.


Late edit: Discussion on the BuildKit issue suggests a simpler solution: not trying to push a cache that contains no layers, i.e. the equivalent of not using --cache-to in this case. I'm not sure I'll get a chance to prototype that anytime soon, though.
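
The same idea can be approximated on the caller's side while the builder in use still attempts the push. The sketch below is a heuristic, not BuildKit's logic: it assumes that only RUN, COPY and ADD produce layers, and the function name and the DOCKERFILE_PATH and CACHE_REF defaults are illustrative.

```shell
#!/bin/sh
# Heuristic (assumption): only RUN, COPY and ADD are treated as layer-producing.
# Commented-out instructions such as "# RUN ..." do not match.
has_layer_instructions() {
  grep -Eiqs '^[[:space:]]*(RUN|COPY|ADD)[[:space:]]' "$1"
}

# Placeholder defaults; adjust to your project.
DOCKERFILE_PATH="${DOCKERFILE_PATH:-Dockerfile}"
CACHE_REF="${CACHE_REF:-registry.example.com/my-image:cache}"

if has_layer_instructions "$DOCKERFILE_PATH"; then
  CACHE_TO="--cache-to=type=registry,ref=${CACHE_REF},mode=max,image-manifest=true"
else
  # Nothing to cache: skip the export instead of pushing an empty cache.
  CACHE_TO=""
fi

echo "cache export flag: ${CACHE_TO:-none}"
```

The resulting CACHE_TO value can then be passed, unquoted so that an empty value disappears, to docker buildx build.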

@kattmang

Thanks @TBBle . I agree with your latest edit here, FWIW. We should go this route :)

@TBBle

TBBle commented Oct 17, 2023

Okay, that change has merged, so once BuildKit 0.13 ships and downstreams like buildx and Docker pick it up, the "405 error for cache with no layers" should no longer occur, as the push will not be attempted.

I don't have a specific timeline on that happening though.

whoan added a commit to whoan/docker-build-with-cache-action that referenced this issue Oct 22, 2023
AWS ECR needs image-manifest=true for cache images to be pushed.
See aws/containers-roadmap#876 for more info.
@rafavallina

rafavallina commented Oct 24, 2023

Ok, while we wait for Docker 25 to come out, we have published a blog post that explains how AWS supports cache manifest and the implementation: https://aws.amazon.com/blogs/containers/announcing-remote-cache-support-in-amazon-ecr-for-buildkit-clients/

I'm going to keep this issue open as the conversation is valuable and we can gather more feedback on it for the time being

@rafavallina

Docker 25 is out! We're closing this issue at this point. Thanks all for the input and feedback!

Projects
Status: Shipped