Ignore the exit code of modprobe always #468

Merged
merged 3 commits into docker-library:master from infosiftr/better-iptables on Jan 5, 2024

Conversation

@tianon tianon (Member) commented Dec 18, 2023

The nature of modprobe in this image is that it works via ip hacks, but the exit code will always be non-zero because we don't have /lib/modules from the host.

The effect of this was that everyone was using iptables-legacy (whether it was warranted for them to be doing so or not).

This probably fixes #466
This probably also fixes #467
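For illustration, a minimal sketch of the idea in the title, assuming a set -e entrypoint (the module names below are illustrative, not necessarily what the script actually probes): appending "|| :" lets the image's modprobe shim run for its side effects while its exit code, which is always non-zero here because /lib/modules is absent, is ignored.

# sketch only: run modprobe for its side effects, never for its exit code
modprobe ip_tables || :
modprobe nf_tables || :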

	if /usr/local/sbin/.iptables-legacy/iptables -nL > /dev/null 2>&1; then
		# see https://github.com/docker-library/docker/issues/463 (and the dind Dockerfile where this directory is set up)
		export PATH="/usr/local/sbin/.iptables-legacy:$PATH"
	fi
fi
When the host has both nf_tables and ip_tables but is using ip_tables, iptables gives this warning:

iptables -nL >/dev/null
# Warning: iptables-legacy tables present, use iptables-legacy to see them

Something like this worked for me to detect that case and use iptables-legacy:

        elif iptables -nL 2>&1 >/dev/null | grep -q tables-legacy; then
                export PATH="/usr/local/sbin/.iptables-legacy:$PATH"
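A note on the redirection order in that snippet: "2>&1 >/dev/null" duplicates stderr onto the pipe before stdout is discarded, so only the warning text reaches grep. A standalone way to check the same condition (a sketch, not code from this PR):

# only stderr survives into the pipe; the normal table listing goes to /dev/null
if iptables -nL 2>&1 >/dev/null | grep -q tables-legacy; then
	echo 'legacy tables present; prefer iptables-legacy'
fi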

tianon (Member Author)

Oh, interesting! Any idea in what situations/use cases a fresh network namespace might have legacy tables set up in it already?

tianon (Member Author)

I found the code in iptables itself responsible for that check: 😄

https://git.netfilter.org/iptables/tree/iptables/nft-shared.c?id=f5cf76626d95d2c491a80288bccc160c53b44e88#n420

It's effectively "read from /proc/net/ip_tables_names and if it's non-empty, print the warning", so we can do better than grepping the output of the tool and instead do a simple [ -s /proc/net/ip_tables_names ] (although we probably ought to check all three of the "legacy" files the upstream code references, for good measure).
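In entrypoint terms, that might look something like the sketch below (the three file names are taken from that upstream check; the variable name is illustrative). As the next comment shows, the -s test turns out not to be sufficient for these procfs files, so treat this as a first iteration:

# first attempt: -s means "exists and has a size greater than zero"
for f in /proc/net/ip_tables_names /proc/net/ip6_tables_names /proc/net/arp_tables_names; do
	if [ -s "$f" ]; then
		iptablesLegacy=1
		break
	fi
done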

tianon (Member Author)

> Oh, interesting! Any idea in what situations/use cases a fresh network namespace might have legacy tables set up in it already?

Did some more testing and figured this out -- it turns out that on CentOS 7, /proc/net/ip_tables_names is not empty in a brand new network namespace, even though there aren't any actual rules. What's even funnier is that I was confused by the upstream code reading from the files and then throwing away everything it reads; it turns out that's done because these special kernel files have contents, but if you stat them their size is 0 bytes. 🙃

8637e230c491:/# test -s /proc/net/ip_tables_names; echo $?
1
8637e230c491:/# stat /proc/net/ip_tables_names
  File: /proc/net/ip_tables_names
  Size: 0         	Blocks: 0          IO Block: 1024   regular empty file
Device: 29h/41d	Inode: 4026532469  Links: 1
Access: (0440/-r--r-----)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-01-04 18:43:59.112212543 +0000
Modify: 2024-01-04 18:43:59.112212543 +0000
Change: 2024-01-04 18:43:59.112212543 +0000
8637e230c491:/# cat /proc/net/ip_tables_names
nat
mangle
security
raw
filter

(so my -s test isn't sufficient 😞)
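A sketch of the content-based variant that sidesteps the zero-size stat quirk (again with the file list assumed from the upstream check, and not necessarily the exact code that was merged):

# procfs reports a 0-byte size for these files, so read them instead of trusting stat
iptablesLegacy=
for f in /proc/net/ip_tables_names /proc/net/ip6_tables_names /proc/net/arp_tables_names; do
	if [ -n "$(cat "$f" 2>/dev/null)" ]; then
		# a non-empty list of table names means legacy xtables is already in use
		iptablesLegacy=1
		break
	fi
done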

@steve-mt

I was asked to test this on GitLab's infrastructure in #463 (comment); below are the test results, which are successful 🎉:

# Spin up server with our image
$ gcloud compute instances create docker-test-cos-85 --image cos-85-13310-1498-7 --image-project cos-cloud --zone=us-east1-c

# Build image
$ gcloud compute ssh docker-test-cos-85
$ docker build --pull 'https://github.com/docker-library/docker.git#refs/pull/468/merge:24/dind'

# Start Docker image
steve@docker-test-cos-85 ~ $ docker run --privileged --rm -it 10bb05416255
Certificate request self-signature ok
subject=CN = docker:dind server
/certs/server/cert.pem: OK
Certificate request self-signature ok
subject=CN = docker:dind client
/certs/client/cert.pem: OK
ip: can't find device 'nf_tables'
modprobe: can't change directory to '/lib/modules': No such file or directory
ip: can't find device 'ip_tables'
modprobe: can't change directory to '/lib/modules': No such file or directory
INFO[2023-12-19T10:25:43.763856164Z] Starting up
...
INFO[2023-12-19T10:25:43.850111407Z] containerd successfully booted in 0.054331s
INFO[2023-12-19T10:25:43.892042049Z] Loading containers: start.
INFO[2023-12-19T10:25:43.978291484Z] Loading containers: done.
INFO[2023-12-19T10:25:43.991579126Z] Docker daemon                                 commit=311b9ff graphdriver=overlay2 version=24.0.7
INFO[2023-12-19T10:25:43.992161922Z] Daemon has completed initialization
INFO[2023-12-19T10:25:44.036184473Z] API listen on /var/run/docker.sock
INFO[2023-12-19T10:25:44.036664068Z] API listen on [::]:2376
Build logs:
Sending build context to Docker daemon  14.34kB
Step 1/12 : FROM docker:24-cli
24-cli: Pulling from library/docker
661ff4d9561e: Pull complete
b94b6133cd82: Pull complete
4f4fb700ef54: Pull complete
de090124fc79: Pull complete
f04b53a848df: Pull complete
69c523561410: Pull complete
64a7d03e6f4c: Pull complete
ace0118efca4: Pull complete
80554da1d9db: Pull complete
Digest: sha256:1d835c53466737f96cc5a85e94a575acef9bdfb02da59dc21b82653fd4f1d1eb
Status: Downloaded newer image for docker:24-cli
 ---> cbdc7727a012
Step 2/12 : RUN set -eux;       apk add --no-cache              btrfs-progs             e2fsprogs               e2fsprogs-extra                 ip6tables               iptables                openssl                 shadow-uidmap           xfsprogs                xz              pigz    ;       if zfs="$(apk info --no-cache --quiet zfs)" && [ -n "$zfs" ]; then           apk add --no-cache zfs;         fi
 ---> Running in 3dc9d71aea25
+ apk add --no-cache btrfs-progs e2fsprogs e2fsprogs-extra ip6tables iptables openssl shadow-uidmap xfsprogs xz pigz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/community/x86_64/APKINDEX.tar.gz
(1/28) Installing libblkid (2.39.3-r0)
(2/28) Installing lzo (2.10-r5)
(3/28) Installing eudev-libs (3.2.14-r0)
(4/28) Installing libuuid (2.39.3-r0)
(5/28) Installing zstd-libs (1.5.5-r8)
(6/28) Installing btrfs-progs (6.6.2-r0)
(7/28) Installing libcom_err (1.47.0-r5)
(8/28) Installing e2fsprogs-libs (1.47.0-r5)
(9/28) Installing e2fsprogs (1.47.0-r5)
(10/28) Installing e2fsprogs-extra (1.47.0-r5)
(11/28) Installing libmnl (1.0.5-r2)
(12/28) Installing libnftnl (1.2.6-r0)
(13/28) Installing libxtables (1.8.10-r1)
(14/28) Installing iptables (1.8.10-r1)
(15/28) Installing openssl (3.1.4-r2)
(16/28) Installing pigz (2.8-r0)
(17/28) Installing libmd (1.1.0-r0)
(18/28) Installing libbsd (0.11.7-r3)
(19/28) Installing skalibs (2.14.0.1-r0)
(20/28) Installing utmps-libs (0.1.2.2-r0)
(21/28) Installing linux-pam (1.5.3-r7)
(22/28) Installing shadow-libs (4.14.2-r0)
(23/28) Installing shadow-subids (4.14.2-r0)
(24/28) Installing inih (57-r1)
(25/28) Installing userspace-rcu (0.14.0-r2)
(26/28) Installing xfsprogs (6.5.0-r0)
(27/28) Installing xz-libs (5.4.5-r0)
(28/28) Installing xz (5.4.5-r0)
Executing busybox-1.36.1-r15.trigger
OK: 24 MiB in 50 packages
+ apk info --no-cache --quiet zfs
+ zfs='zfs-2.2.2-r0 description:
Advanced filesystem and volume manager

zfs-2.2.2-r0 webpage:
https://openzfs.org

zfs-2.2.2-r0 installed size:
1396 KiB'
+ '[' -n 'zfs-2.2.2-r0 description:
Advanced filesystem and volume manager

zfs-2.2.2-r0 webpage:
https://openzfs.org

zfs-2.2.2-r0 installed size:
1396 KiB' ]
+ apk add --no-cache zfs
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/community/x86_64/APKINDEX.tar.gz
(1/9) Installing libintl (0.22.3-r0)
(2/9) Installing libtirpc-conf (1.3.4-r0)
(3/9) Installing krb5-conf (1.0-r2)
(4/9) Installing keyutils-libs (1.6.3-r3)
(5/9) Installing libverto (0.3.2-r2)
(6/9) Installing krb5-libs (1.21.2-r0)
(7/9) Installing libtirpc (1.3.4-r0)
(8/9) Installing zfs-libs (2.2.2-r0)
(9/9) Installing zfs (2.2.2-r0)
Executing busybox-1.36.1-r15.trigger
OK: 31 MiB in 59 packages
Removing intermediate container 3dc9d71aea25
 ---> aa06f320d2f5
Step 3/12 : RUN set -eux;       apk add --no-cache iptables-legacy;     mkdir -p /usr/local/sbin/.iptables-legacy;      for f in                iptables                iptables-save           iptables-restore                ip6tables               ip6tables-save          ip6tables-restore       ; do            b="/sbin/${f/tables/tables-legacy}";                 "$b" --version;                 ln -svT "$b" "/usr/local/sbin/.iptables-legacy/$f";     done;   export PATH="/usr/local/sbin/.iptables-legacy:$PATH";   iptables --version | grep legacy
 ---> Running in c3f3ccab0cbe
+ apk add --no-cache iptables-legacy
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/community/x86_64/APKINDEX.tar.gz
(1/3) Installing libip4tc (1.8.10-r1)
(2/3) Installing libip6tc (1.8.10-r1)
(3/3) Installing iptables-legacy (1.8.10-r1)
Executing busybox-1.36.1-r15.trigger
OK: 32 MiB in 62 packages
+ mkdir -p /usr/local/sbin/.iptables-legacy
+ b=/sbin/iptables-legacy
+ /sbin/iptables-legacy --version
iptables v1.8.10 (legacy)
+ ln -svT /sbin/iptables-legacy /usr/local/sbin/.iptables-legacy/iptables
'/usr/local/sbin/.iptables-legacy/iptables' -> '/sbin/iptables-legacy'
+ b=/sbin/iptables-legacy-save
+ /sbin/iptables-legacy-save --version
iptables-save v1.8.10 (legacy)
+ ln -svT /sbin/iptables-legacy-save /usr/local/sbin/.iptables-legacy/iptables-save
'/usr/local/sbin/.iptables-legacy/iptables-save' -> '/sbin/iptables-legacy-save'
+ b=/sbin/iptables-legacy-restore
+ /sbin/iptables-legacy-restore --version
iptables-restore v1.8.10 (legacy)
+ ln -svT /sbin/iptables-legacy-restore /usr/local/sbin/.iptables-legacy/iptables-restore
'/usr/local/sbin/.iptables-legacy/iptables-restore' -> '/sbin/iptables-legacy-restore'
+ b=/sbin/ip6tables-legacy
+ /sbin/ip6tables-legacy --version
ip6tables v1.8.10 (legacy)
+ ln -svT /sbin/ip6tables-legacy /usr/local/sbin/.iptables-legacy/ip6tables
+ b=/sbin/ip6tables-legacy-save
+ /sbin/ip6tables-legacy-save --version
'/usr/local/sbin/.iptables-legacy/ip6tables' -> '/sbin/ip6tables-legacy'
ip6tables-save v1.8.10 (legacy)
+ ln -svT /sbin/ip6tables-legacy-save /usr/local/sbin/.iptables-legacy/ip6tables-save
'/usr/local/sbin/.iptables-legacy/ip6tables-save' -> '/sbin/ip6tables-legacy-save'
+ b=/sbin/ip6tables-legacy-restore
+ /sbin/ip6tables-legacy-restore --version
ip6tables-restore v1.8.10 (legacy)
+ ln -svT /sbin/ip6tables-legacy-restore /usr/local/sbin/.iptables-legacy/ip6tables-restore
'/usr/local/sbin/.iptables-legacy/ip6tables-restore' -> '/sbin/ip6tables-legacy-restore'
+ export 'PATH=/usr/local/sbin/.iptables-legacy:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
+ grep legacy
+ iptables --version
iptables v1.8.10 (legacy)
Removing intermediate container c3f3ccab0cbe
 ---> 85cc7d1d3221
Step 4/12 : RUN set -eux;       addgroup -S dockremap;  adduser -S -G dockremap dockremap;      echo 'dockremap:165536:65536' >> /etc/subuid;   echo 'dockremap:165536:65536' >> /etc/subgid
 ---> Running in 1403f02d15d5
+ addgroup -S dockremap
+ adduser -S -G dockremap dockremap
+ echo dockremap:165536:65536
+ echo dockremap:165536:65536
Removing intermediate container 1403f02d15d5
 ---> f459e96929b0
Step 5/12 : RUN set -eux;               apkArch="$(apk --print-arch)";  case "$apkArch" in              'x86_64')                       url='https://download.docker.com/linux/static/stable/x86_64/docker-24.0.7.tgz';                         ;;              'armhf')                        url='https://download.docker.com/linux/static/stable/armel/docker-24.0.7.tgz';                       ;;              'armv7')                        url='https://download.docker.com/linux/static/stable/armhf/docker-24.0.7.tgz';                  ;;              'aarch64')                      url='https://download.docker.com/linux/static/stable/aarch64/docker-24.0.7.tgz';                     ;;              *) echo >&2 "error: unsupported 'docker.tgz' architecture ($apkArch)"; exit 1 ;;        esac;           wget -O 'docker.tgz' "$url";            tar --extract           --file docker.tgz               --strip-components 1            --directory /usr/local/bin/          --no-same-owner                 --exclude 'docker/docker'       ;       rm docker.tgz;          dockerd --version;      containerd --version;   ctr --version;  runc --version
 ---> Running in ce56712e529b
+ apk --print-arch
+ apkArch=x86_64
+ url=https://download.docker.com/linux/static/stable/x86_64/docker-24.0.7.tgz
+ wget -O docker.tgz https://download.docker.com/linux/static/stable/x86_64/docker-24.0.7.tgz
Connecting to download.docker.com (3.161.193.14:443)
saving to 'docker.tgz'
docker.tgz            46% |**************                  | 30.9M  0:00:01 ETA
docker.tgz           100% |********************************| 66.5M  0:00:00 ETA
'docker.tgz' saved
+ tar --extract --file docker.tgz --strip-components 1 --directory /usr/local/bin/ --no-same-owner --exclude docker/docker
+ rm docker.tgz
+ dockerd --version
Docker version 24.0.7, build 311b9ff
+ containerd --version
containerd github.com/containerd/containerd v1.7.6 091922f03c2762540fd057fba91260237ff86acb
+ ctr --version
ctr github.com/containerd/containerd v1.7.6
+ runc --version
runc version 1.1.9
commit: v1.1.9-0-gccaecfc
spec: 1.0.2-dev
go: go1.20.10
libseccomp: 2.5.1
Removing intermediate container ce56712e529b
 ---> e88711b42fe2
Step 6/12 : ENV DIND_COMMIT 65cfcc28ab37cb75e1560e4b4738719c07c6618e
 ---> Running in 15f53b4abdb0
Removing intermediate container 15f53b4abdb0
 ---> fb163af82a2d
Step 7/12 : RUN set -eux;       wget -O /usr/local/bin/dind "https://raw.githubusercontent.com/docker/docker/${DIND_COMMIT}/hack/dind";         chmod +x /usr/local/bin/dind
 ---> Running in 1d3acb077218
+ wget -O /usr/local/bin/dind https://raw.githubusercontent.com/docker/docker/65cfcc28ab37cb75e1560e4b4738719c07c6618e/hack/dind
Connecting to raw.githubusercontent.com (185.199.111.133:443)
saving to '/usr/local/bin/dind'
dind                 100% |********************************|  2862  0:00:00 ETA
'/usr/local/bin/dind' saved
+ chmod +x /usr/local/bin/dind
Removing intermediate container 1d3acb077218
 ---> 197cda935b46
Step 8/12 : COPY dockerd-entrypoint.sh /usr/local/bin/
 ---> c0451e74e59b
Step 9/12 : VOLUME /var/lib/docker
 ---> Running in 75ea4536b7d4
Removing intermediate container 75ea4536b7d4
 ---> d2aee2b96022
Step 10/12 : EXPOSE 2375 2376
 ---> Running in 3c8d45fc03fd
Removing intermediate container 3c8d45fc03fd
 ---> 595b1de6da78
Step 11/12 : ENTRYPOINT ["dockerd-entrypoint.sh"]
 ---> Running in 2df28603eea9
Removing intermediate container 2df28603eea9
 ---> dc532f8177ca
Step 12/12 : CMD []
 ---> Running in 1f73f5f7eaa6
Removing intermediate container 1f73f5f7eaa6
 ---> 10bb05416255
Successfully built 10bb05416255
Start-up logs:
Certificate request self-signature ok
subject=CN = docker:dind server
/certs/server/cert.pem: OK
Certificate request self-signature ok
subject=CN = docker:dind client
/certs/client/cert.pem: OK
ip: can't find device 'nf_tables'
modprobe: can't change directory to '/lib/modules': No such file or directory
ip: can't find device 'ip_tables'
modprobe: can't change directory to '/lib/modules': No such file or directory
INFO[2023-12-19T10:25:43.763856164Z] Starting up
WARN[2023-12-19T10:25:43.767709129Z] could not change group /var/run/docker.sock to docker: group docker not found
INFO[2023-12-19T10:25:43.768177577Z] containerd not running, starting managed containerd
INFO[2023-12-19T10:25:43.788860160Z] started new containerd process                address=/var/run/docker/containerd/containerd.sock module=libcontainerd pid=62
INFO[2023-12-19T10:25:43.796565663Z] starting containerd                           revision=091922f03c2762540fd057fba91260237ff86acb version=v1.7.6
INFO[2023-12-19T10:25:43.822393856Z] loading plugin "io.containerd.snapshotter.v1.aufs"...  type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.827147221Z] skip loading plugin "io.containerd.snapshotter.v1.aufs"...  error="aufs is not supported (modprobe aufs failed: exit status 1 \"ip: can't find device 'aufs'\\nmodprobe: can't change directory to '/lib/modules': No such file or directory\\n\"): skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.827587958Z] loading plugin "io.containerd.content.v1.content"...  type=io.containerd.content.v1
INFO[2023-12-19T10:25:43.827946063Z] loading plugin "io.containerd.snapshotter.v1.blockfile"...  type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.828268888Z] skip loading plugin "io.containerd.snapshotter.v1.blockfile"...  error="no scratch file generator: skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.828486950Z] loading plugin "io.containerd.snapshotter.v1.native"...  type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.828765624Z] loading plugin "io.containerd.snapshotter.v1.overlayfs"...  type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.829292901Z] loading plugin "io.containerd.snapshotter.v1.devmapper"...  type=io.containerd.snapshotter.v1
WARN[2023-12-19T10:25:43.829552796Z] failed to load plugin io.containerd.snapshotter.v1.devmapper  error="devmapper not configured"
INFO[2023-12-19T10:25:43.829752351Z] loading plugin "io.containerd.snapshotter.v1.zfs"...  type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.830185103Z] skip loading plugin "io.containerd.snapshotter.v1.zfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-12-19T10:25:43.830413003Z] loading plugin "io.containerd.metadata.v1.bolt"...  type=io.containerd.metadata.v1
WARN[2023-12-19T10:25:43.830672780Z] could not use snapshotter devmapper in metadata plugin  error="devmapper not configured"
INFO[2023-12-19T10:25:43.830878614Z] metadata content store policy set             policy=shared
INFO[2023-12-19T10:25:43.841062065Z] loading plugin "io.containerd.differ.v1.walking"...  type=io.containerd.differ.v1
INFO[2023-12-19T10:25:43.841462101Z] loading plugin "io.containerd.event.v1.exchange"...  type=io.containerd.event.v1
INFO[2023-12-19T10:25:43.841688874Z] loading plugin "io.containerd.gc.v1.scheduler"...  type=io.containerd.gc.v1
INFO[2023-12-19T10:25:43.841994584Z] loading plugin "io.containerd.lease.v1.manager"...  type=io.containerd.lease.v1
INFO[2023-12-19T10:25:43.842232622Z] loading plugin "io.containerd.nri.v1.nri"...  type=io.containerd.nri.v1
INFO[2023-12-19T10:25:43.842390326Z] NRI interface is disabled by configuration.
INFO[2023-12-19T10:25:43.842548877Z] loading plugin "io.containerd.runtime.v2.task"...  type=io.containerd.runtime.v2
INFO[2023-12-19T10:25:43.843077443Z] loading plugin "io.containerd.runtime.v2.shim"...  type=io.containerd.runtime.v2
INFO[2023-12-19T10:25:43.843310782Z] loading plugin "io.containerd.sandbox.store.v1.local"...  type=io.containerd.sandbox.store.v1
INFO[2023-12-19T10:25:43.843494442Z] loading plugin "io.containerd.sandbox.controller.v1.local"...  type=io.containerd.sandbox.controller.v1
INFO[2023-12-19T10:25:43.843721326Z] loading plugin "io.containerd.streaming.v1.manager"...  type=io.containerd.streaming.v1
INFO[2023-12-19T10:25:43.843896820Z] loading plugin "io.containerd.service.v1.introspection-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.844076186Z] loading plugin "io.containerd.service.v1.containers-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.844259092Z] loading plugin "io.containerd.service.v1.content-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.844439127Z] loading plugin "io.containerd.service.v1.diff-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.844613034Z] loading plugin "io.containerd.service.v1.images-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.844806103Z] loading plugin "io.containerd.service.v1.namespaces-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.844976196Z] loading plugin "io.containerd.service.v1.snapshots-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.845077128Z] loading plugin "io.containerd.runtime.v1.linux"...  type=io.containerd.runtime.v1
INFO[2023-12-19T10:25:43.845544661Z] loading plugin "io.containerd.monitor.v1.cgroups"...  type=io.containerd.monitor.v1
INFO[2023-12-19T10:25:43.846199867Z] loading plugin "io.containerd.service.v1.tasks-service"...  type=io.containerd.service.v1
INFO[2023-12-19T10:25:43.846427968Z] loading plugin "io.containerd.grpc.v1.introspection"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.846670199Z] loading plugin "io.containerd.transfer.v1.local"...  type=io.containerd.transfer.v1
INFO[2023-12-19T10:25:43.846902808Z] loading plugin "io.containerd.internal.v1.restart"...  type=io.containerd.internal.v1
INFO[2023-12-19T10:25:43.847248050Z] loading plugin "io.containerd.grpc.v1.containers"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847306089Z] loading plugin "io.containerd.grpc.v1.content"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847328827Z] loading plugin "io.containerd.grpc.v1.diff"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847351193Z] loading plugin "io.containerd.grpc.v1.events"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847430123Z] loading plugin "io.containerd.grpc.v1.healthcheck"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847463689Z] loading plugin "io.containerd.grpc.v1.images"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847583614Z] loading plugin "io.containerd.grpc.v1.leases"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847615842Z] loading plugin "io.containerd.grpc.v1.namespaces"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.847653324Z] loading plugin "io.containerd.internal.v1.opt"...  type=io.containerd.internal.v1
INFO[2023-12-19T10:25:43.848293139Z] loading plugin "io.containerd.grpc.v1.sandbox-controllers"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.848441887Z] loading plugin "io.containerd.grpc.v1.sandboxes"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.848463076Z] loading plugin "io.containerd.grpc.v1.snapshots"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.848490428Z] loading plugin "io.containerd.grpc.v1.streaming"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.848569045Z] loading plugin "io.containerd.grpc.v1.tasks"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.848594547Z] loading plugin "io.containerd.grpc.v1.transfer"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.848742462Z] loading plugin "io.containerd.grpc.v1.version"...  type=io.containerd.grpc.v1
INFO[2023-12-19T10:25:43.848765822Z] loading plugin "io.containerd.tracing.processor.v1.otlp"...  type=io.containerd.tracing.processor.v1
INFO[2023-12-19T10:25:43.848789341Z] skip loading plugin "io.containerd.tracing.processor.v1.otlp"...  error="no OpenTelemetry endpoint: skip plugin" type=io.containerd.tracing.processor.v1
INFO[2023-12-19T10:25:43.848815341Z] loading plugin "io.containerd.internal.v1.tracing"...  type=io.containerd.internal.v1
INFO[2023-12-19T10:25:43.848829451Z] skipping tracing processor initialization (no tracing plugin)  error="no OpenTelemetry endpoint: skip plugin"
INFO[2023-12-19T10:25:43.849402096Z] serving...                                    address=/var/run/docker/containerd/containerd-debug.sock
INFO[2023-12-19T10:25:43.849843536Z] serving...                                    address=/var/run/docker/containerd/containerd.sock.ttrpc
INFO[2023-12-19T10:25:43.850039914Z] serving...                                    address=/var/run/docker/containerd/containerd.sock
INFO[2023-12-19T10:25:43.850111407Z] containerd successfully booted in 0.054331s
INFO[2023-12-19T10:25:43.892042049Z] Loading containers: start.
INFO[2023-12-19T10:25:43.978291484Z] Loading containers: done.
INFO[2023-12-19T10:25:43.991579126Z] Docker daemon                                 commit=311b9ff graphdriver=overlay2 version=24.0.7
INFO[2023-12-19T10:25:43.992161922Z] Daemon has completed initialization
INFO[2023-12-19T10:25:44.036184473Z] API listen on /var/run/docker.sock
INFO[2023-12-19T10:25:44.036664068Z] API listen on [::]:2376

@tianon tianon (Member Author) commented Dec 19, 2023

Extremely appreciated, @stevexuereb 🙇
(it's always stressful when we break lots of people at once 😅)

@tianon tianon (Member Author) commented Dec 19, 2023

I thought about adding the following as well, but discussion with @ag-TJNII in #466 led to the idea of an explicit configuration flag for opting into legacy instead, since checking loaded modules is really fragile (what if the module was loaded a while ago and never used? what if one module is built into the kernel config, but the other isn't? etc etc etc)

diff --git a/dockerd-entrypoint.sh b/dockerd-entrypoint.sh
index e610cca..0d3a581 100755
--- a/dockerd-entrypoint.sh
+++ b/dockerd-entrypoint.sh
@@ -156,6 +156,10 @@ if [ "$1" = 'dockerd' ]; then
 		# https://git.netfilter.org/iptables/tree/iptables/nft-shared.c?id=f5cf76626d95d2c491a80288bccc160c53b44e88#n420
 		# if we already have any "legacy" iptables rules, we should always use legacy (https://github.com/docker-library/docker/pull/468#discussion_r1430804593)
 		iptablesLegacy=1
+	elif grep -qE '^ip_tables ' /proc/modules && ! grep -qE '^nf_tables ' /proc/modules; then
+		# if the "ip_tables" module is loaded but the "nf_tables" module is not, we should probably use legacy (to match the host)
+		# in theory, this helps with broken implementations like CentOS 7 which *has* the "nf_tables" module but it appears to not work properly inside a network namespace (https://github.com/docker-library/docker/issues/466)
+		iptablesLegacy=1
 	elif ! iptables -nL > /dev/null 2>&1; then
 		# if iptables fails to run, chances are high the necessary kernel modules aren't loaded (perhaps the host is using xtables, for example)
 		# https://github.com/docker-library/docker/issues/350

@tianon tianon (Member Author) commented Dec 19, 2023

Since the current implementation amounts to effectively "always use legacy" (matching the previous Alpine 3.18 image behavior), I also think it might be prudent, at this point in December, to wait until the new year to move further on this (in the interest of being respectful of folks' holidays if we do manage to regress again with this).

Comment on lines +147 to +149
	if [ -n "${DOCKER_IPTABLES_LEGACY+x}" ]; then
		# let users choose explicitly to legacy or not to legacy
		iptablesLegacy="$DOCKER_IPTABLES_LEGACY"
tianon (Member Author)

I didn't actually test this before I pushed it, but now I have: 😅

$ docker run --rm 3f3f60609d5a |& grep '^iptables '
iptables v1.8.10 (nf_tables)
$ docker run --rm --env DOCKER_IPTABLES_LEGACY= 3f3f60609d5a |& grep '^iptables '
iptables v1.8.10 (nf_tables)
$ docker run --rm --env DOCKER_IPTABLES_LEGACY=1 3f3f60609d5a |& grep '^iptables '
iptables v1.8.10 (legacy)


Giving users the opportunity to decide for themselves is the best option I think.

@stanhu stanhu commented Dec 23, 2023

I repeated @stevexuereb's test with Google COS 105. It works with --env DOCKER_IPTABLES_LEGACY=1, but fails without it, as explained in #467 (comment).

@tianon tianon (Member Author) commented Jan 4, 2024

I spent a bunch of time today trying to find a way to reliably detect CONFIG_NF_TABLES being loaded/built-in on the kernel (ie, even if it's set to =y and thus won't show up in /proc/modules), and came up empty-handed. 😞
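To illustrate why that detection comes up empty (the commands are illustrative, not from the PR): a built-in nf_tables never shows up in /proc/modules, and the usual fallback, the kernel's modules.builtin list, lives under /lib/modules, which this image deliberately does not have.

# lists only *loaded loadable* modules, so CONFIG_NF_TABLES=y is invisible here
grep -E '^nf_tables ' /proc/modules
# this would answer the built-in question on the host...
grep -qw nf_tables "/lib/modules/$(uname -r)/modules.builtin"
# ...but /lib/modules is not available inside this image, so the file simply isn't there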

The more I mess with this, the more I'm of the opinion that we should follow the distros/kernel (for example, Debian 10 / Buster is when Debian switched to nftables, and Ubuntu had definitely switched by 20.04) and just default to nftables with the manual (and temporary!) legacy escape hatch. At most, maybe (correctly) implement the "if legacy rules seem to exist, use legacy" check (#468 (comment)), since that's a strong signal that we should use legacy.

What I don't want to do is give the appearance that anything but nftables is actually well-supported, because we don't really know (or have any control over) how long "legacy" will still be available/usable/possible for us to support (the legacy wrappers being removed entirely from Alpine, for example).

Edit: oh, that's basically what I've implemented here, just with the -s tests fixed. It's been a few weeks now; my brain's slipping 😭

@tianon tianon (Member Author) commented Jan 4, 2024

With this latest push, I successfully get iptables v1.8.10 (legacy) on CentOS 7 (even with nf_tables loaded and supposedly working) and iptables v1.8.10 (nf_tables) on my Debian host.

@tianon tianon (Member Author) commented Jan 4, 2024

Anyone feel like testing this one more time before we merge? 👀 (ie, last call! 😅)

@stanhu stanhu commented Jan 5, 2024

Tested the latest changes with Google Container OS:

  • COS 85: ✅
  • COS 105 needs --env DOCKER_IPTABLES_LEGACY=1, but otherwise ✅

@tianon tianon (Member Author) commented Jan 5, 2024

Thanks @stanhu!! It's really appreciated 🙇

@tianon tianon merged commit bfe953e into docker-library:master Jan 5, 2024
8 checks passed
@tianon tianon deleted the better-iptables branch January 5, 2024 23:17
docker-library-bot added a commit to docker-library-bot/official-images that referenced this pull request Jan 5, 2024
Changes:

- docker-library/docker@bfe953e: Merge pull request docker-library/docker#468 from infosiftr/better-iptables
martin-g pushed a commit to martin-g/docker-official-images that referenced this pull request Apr 3, 2024
Changes:

- docker-library/docker@bfe953e: Merge pull request docker-library/docker#468 from infosiftr/better-iptables
@Chickenmarkus

> Tested the latest changes with Google Container OS:
> * COS 85: ✅
> * COS 105 needs --env DOCKER_IPTABLES_LEGACY=1, but otherwise ✅

Some more background information, for documentation purposes only, since internet searches lead to this issue.

This is not an issue with COS 105 in general, but with build ID 17412.226.68 and below. Those builds already include the nf_tables kernel module, which is detected and used by the Docker startup scripts; however, the module is not yet functional, so the startup scripts fail.
Starting with build ID 17412.294.10, the nf_tables kernel module has officially been announced and is functional.

As a result, our GKE 1.26.6-gke.1700 (uses cos-101-17162-210-48) was working before the automatic update in the REGULAR release channel, but stopped working after the upgrade to 1.27.8-gke.1067004 (uses cos-105-17412-226-62).
We had to manually upgrade to the RAPID release channel (1.29.1-gke.1589017 with cos-109-17800-66-78) or pin the version to 1.27.11-gke.1062000 (uses cos-105-17412-294-29).
