The remove-node.yml playbook doesn’t clean up static pods and DaemonSet pods #11627

Open
imo-ininder opened this issue Oct 11, 2024 · 3 comments · May be fixed by #11631
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@imo-ininder

What happened?

When I use remove-node.yml to remove a node, I expect the node to return to the state it was in before it was first installed. Instead, the containers of static pods and DaemonSet pods are left running on the node. When we try to reuse the node, Nginx fails with a "port already in use" error. To work around this, we have to SSH into the node and kill the containerd-shim-v2 process.
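
The leftover state described above can be checked on the node before re-adding it. The following is a minimal diagnostic sketch and not part of Kubespray: `port_in_use` and `find_shim_pids` are hypothetical helper names, and it assumes a Linux node with a readable `/proc`.

```python
import os
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def find_shim_pids(proc_root="/proc", needle="containerd-shim"):
    """Return sorted PIDs whose command line contains `needle`.

    The default needle is an assumption: it is a prefix meant to match
    the shim process name mentioned in this report.
    """
    pids = []
    if not os.path.isdir(proc_root):
        return pids
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                cmdline = fh.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if needle in cmdline:
            pids.append(int(entry))
    return sorted(pids)
```

For example, `port_in_use(6443)` would test the local port, where 6443 is only an assumed value for the nginx-proxy listen port; substitute whatever port your configuration uses. A non-empty result from `find_shim_pids()` after remove-node.yml has finished would match the behavior reported here.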

What did you expect to happen?

remove-node.yml should clean up the node the same way reset.yml does.
It should include a step that forces all containers to stop.
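
To make the requested step concrete, here is a rough sketch of force-stopping and removing everything through `crictl`. It is an illustration only and not the tasks that reset.yml actually runs; `force_remove_all` and its `run` parameter are hypothetical names. It assumes `crictl` is installed and configured for the node's container runtime, and that kubelet has already been stopped, since a running kubelet would recreate static pods from their manifests.

```python
import subprocess


def _run(cmd):
    """Run a command and return its stdout as text (raises on failure)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def force_remove_all(run=_run):
    """Remove every CRI container, then every pod sandbox.

    `run` is injectable so the command sequence can be inspected in a
    dry run. Returns the list of commands issued, in order.
    """
    issued = []

    def call(cmd):
        issued.append(cmd)
        return run(cmd)

    # "ps -a -q" prints the IDs of all containers, running or not.
    for container_id in call(["crictl", "ps", "-a", "-q"]).split():
        # -f forces removal even if the container is still running.
        call(["crictl", "rm", "-f", container_id])

    # "pods -q" prints the IDs of all pod sandboxes.
    for sandbox_id in call(["crictl", "pods", "-q"]).split():
        # -f forces removal even if the sandbox is still running.
        call(["crictl", "rmp", "-f", sandbox_id])

    return issued
```

Passing a fake `run` that returns canned IDs lets you review the commands before running anything destructive on a real node.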

How can we reproduce it (as minimally and precisely as possible)?

Use cluster.yml to install a cluster with nginx-proxy enabled.
Use remove-node.yml to remove one of the worker nodes.
Then, use scale.yml to scale the same node back into the cluster.
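
The three steps above might correspond to commands like the ones this sketch prints. The report does not include the exact invocations, so everything here is an assumption: the inventory path, the node name `node3`, the `-b` flag, and the helper name `repro_commands` are placeholders to adapt to your environment.

```python
import shlex

INVENTORY = "inventory/mycluster/hosts.yaml"  # assumption: path to your inventory
NODE = "node3"                                # assumption: worker to remove and re-add


def repro_commands(inventory=INVENTORY, node=NODE):
    """Build the three ansible-playbook invocations as argument lists."""
    base = ["ansible-playbook", "-i", inventory, "-b"]
    return [
        base + ["cluster.yml"],                            # 1. install the cluster
        base + ["remove-node.yml", "-e", f"node={node}"],  # 2. remove one worker
        base + ["scale.yml", f"--limit={node}"],           # 3. add the same worker back
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in repro_commands():
        print(shlex.join(cmd))
```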

OS

Linux 5.15.0-116-generic x86_64
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Version of Ansible

ansible [core 2.16.8]

Version of Python

Python 3.10.12

Version of Kubespray (commit)

7e0a407

Network plugin used

calico

Full inventory with variables

Command used to invoke ansible

See the reproduction steps above.

Output of ansible run

Anything else we need to know

No response

@imo-ininder imo-ininder added the kind/bug Categorizes issue or PR as related to a bug. label Oct 11, 2024
@tico88612
Member

I want to confirm some things:

  • Can you check what command you used to remove the node?
  • Did the /etc/kubernetes folder exist when you removed the node?

@tico88612
Member

tico88612 commented Oct 13, 2024

I reproduced this problem, and it looks like the previous containers weren't removed.

EDIT: It is not limited to v2.25; it is still in the master branch.

@tico88612
Member

/assign
