Using flannel as CNI no longer works #1340

Closed
benmoss opened this issue Feb 18, 2020 · 9 comments
Labels: kind/external (upstream bugs), priority/important-soon

Comments

@benmoss
Contributor

benmoss commented Feb 18, 2020

What happened:
After installing Flannel as my CNI, pods fail to start with the error Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "40a61cb16293c414497736143f53da331ef0dca2236e223f3057bd930d51c1c6": failed to find plugin "flannel" in path [/opt/cni/bin]
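For anyone debugging the same error, here is a minimal sketch of the kind of lookup that fails. It is only an approximation of what the container runtime does, not its actual code; the function name `find_cni_plugin` is hypothetical, and the only detail taken from the error message is the search path `/opt/cni/bin`.

```python
import os


def find_cni_plugin(name, search_dirs):
    """Return the first path where a plugin binary called `name` exists.

    Approximates the "find plugin in path" step from the error message.
    Returns None when no directory contains a regular file with that name.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # On an affected node this is expected to report that the binary is missing.
    found = find_cni_plugin("flannel", ["/opt/cni/bin"])
    print(found if found else "flannel plugin not found in /opt/cni/bin")
```

Running this inside the node container should show whether the `flannel` binary is actually present before the Flannel pods start.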

What you expected to happen:
Pods work out of the box as before!

How to reproduce it (as minimally and precisely as possible):
Using 0.7.0, create a cluster with networking.disableDefaultCNI: true. Install the Flannel CNI yaml. See that CoreDNS pods in kube-system fail to start.
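As a rough sketch of the first repro step, the snippet below prints a cluster config with the default CNI disabled. The `networking.disableDefaultCNI` field comes from this issue; the `apiVersion` value `kind.x-k8s.io/v1alpha4` is an assumption and should be checked against the kind version in use. The function name is hypothetical.

```python
def kind_config_without_default_cni(api_version="kind.x-k8s.io/v1alpha4"):
    """Build a minimal kind cluster config that disables the default CNI.

    The api_version default is an assumption; adjust it for your kind release.
    """
    lines = [
        "kind: Cluster",
        f"apiVersion: {api_version}",
        "networking:",
        "  disableDefaultCNI: true",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    import sys

    # Redirect stdout to a file and pass it to `kind create cluster --config`.
    sys.stdout.write(kind_config_without_default_cni())
```

Save the output to a file, pass it to `kind create cluster --config <file>`, then apply the Flannel manifest you normally use.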

Anything else we need to know?:
It works in 0.6.1, it looks like 281a20c is what broke it.

Normally Flannel assumes that the flannel CNI binary is already present on the host, since it is part of the kubernetes-cni package and the default https://github.com/containernetworking/plugins repo.


@benmoss added the kind/bug label Feb 18, 2020
@BenTheElder
Member

This is working as intended. Your deployment manifests for flannel should install the requisite CNI binaries at the correct version instead of assuming the host has them.

most CNI install daemonsets include a step that copies these binaries out from a container image.
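A minimal sketch of that copy step, assuming the plugin binaries ship in some directory inside the image and the host's `/opt/cni/bin` is mounted into the container. Real manifests often do this with a plain `cp` in an init container; the function name and the paths in the usage note are hypothetical.

```python
import os
import shutil


def install_cni_binaries(src_dir, dest_dir, names):
    """Copy the named plugin binaries from src_dir into dest_dir.

    Each file is written to a temporary name, marked executable, and then
    renamed into place so a reader never sees a half-copied binary.
    Returns the list of installed paths.
    """
    os.makedirs(dest_dir, exist_ok=True)
    installed = []
    for name in names:
        src = os.path.join(src_dir, name)
        dest = os.path.join(dest_dir, name)
        tmp = dest + ".tmp"
        shutil.copyfile(src, tmp)
        os.chmod(tmp, 0o755)
        os.replace(tmp, dest)  # atomic when tmp and dest share a filesystem
        installed.append(dest)
    return installed
```

For example, `install_cni_binaries("/image/cni/bin", "/host/opt/cni/bin", ["flannel"])` would overwrite whatever version the host already had, which also covers the wrong-version case.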

@BenTheElder added the kind/external (upstream bugs) label and removed the kind/bug label Feb 18, 2020
@BenTheElder
Member

disableDefaultCNI should be more clearly marked as a "here be dragons" option. We don't test any external CNI currently; at the moment that's a bit out of scope, but the escape hatch is provided to make it easier for power users to do this anyhow.

Regardless of that though, the flannel manifests should really be handling this, as weave, calico, etc. do. Even if these binaries are present on the host, they may be at the wrong version.

@benmoss
Contributor Author

benmoss commented Feb 18, 2020

Yeah, it seems like flannel is the odd one out that has its CNI plugin included in the containernetworking/plugins repo, and so doesn't bother to install it in its deployment manifest. I'll see if I can fix it upstream.

@BenTheElder
Member

Thanks! I forgot to add: more importantly, this is going to break on other hosts. When 281a20c was filed we'd just had an upstream discussion in kubernetes about how the plugins are re-packaged in the kubernetes debs / RPMs...

The conclusion of the discussion was a decision to stop packaging them in the future. Instead, only the plugins directly used by kubelet would be included in the kubelet package, and users would install CNI as needed, rather than all the upstream plugins being packaged in a special deb that kubeadm depends on.

I'm not sure when this will land, but when it does, flannel will need this for a lot of kubeadm-based installs.

@benmoss
Contributor Author

benmoss commented Feb 18, 2020

This looks like a reasonable start to the paper trail you seem to be talking about: https://groups.google.com/forum/#!topic/kubernetes-sig-release/yhf7hAqJEN0/discussion 😸

@BenTheElder
Member

BenTheElder commented Feb 18, 2020 via email

@BenTheElder added the priority/important-soon label Feb 18, 2020
@BenTheElder
Member

If you have any luck upstream, this should likely also go in https://github.com/coreos/flannel/pull/1229/files

We had fun with xtables.lock and kind CI flakiness before doing a similar fix in kindnetd.

@benmoss
Contributor Author

benmoss commented Mar 10, 2020

I haven't had a lot of luck with the maintainers of flannel; the project seems to have been abandoned except for critical fixes. We can close this issue since it's not exactly a kind bug.

@BenTheElder
Member

Ack, that's too bad 😞
