
tainting and untainting logic implemented via configuration #565

Closed
wants to merge 18 commits

Conversation

bilalcaliskan

Which issue(s) this PR fixes:

This PR fixes issue #457.

What this PR does / why we need it:

This PR adds the ability to conditionally taint and untaint a node under specific circumstances. With this improvement, node-problem-detector can be used in conjunction with descheduler. Users should specify taintEnabled, taintKey, taintValue, and taintEffect in config/kernel-monitor.json. If not specified, taintEnabled defaults to false, so NPD will not taint any node. With this improvement, node-problem-detector also removes the taint once the problem is resolved.
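
For illustration only, here is a minimal sketch of what such a configuration might look like in config/kernel-monitor.json. The field names are the ones listed above; their placement and the example values are assumptions, not necessarily what this PR implements:

```json
{
  "source": "kernel-monitor",
  "taintEnabled": true,
  "taintKey": "node-problem-detector/kernel-issue",
  "taintValue": "true",
  "taintEffect": "NoSchedule"
}
```

(The taint key, value, and effect shown are made-up examples; only the four field names come from this PR.)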

Special notes for your reviewer:

This improvement requires updating the ClusterRole for node-problem-detector. If this PR gets merged to master, the update verb must be added to the node-problem-detector ClusterRole.
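
As a rough sketch (assuming taints are applied by updating or patching the Node object; this is not the exact manifest from the deployment), the ClusterRole's node rule would need something like:

```yaml
# Sketch only: the verbs beyond get/list/watch are what tainting via the
# Node object would require; the real manifest lives in the deployment files.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-problem-detector
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
```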

@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 22, 2021
@k8s-ci-robot
Contributor

Hi @bilalcaliskan. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 22, 2021
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 22, 2021
@bilalcaliskan bilalcaliskan force-pushed the master branch 2 times, most recently from 8d48e33 to e204891 on May 22, 2021 20:49
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label May 22, 2021
@bilalcaliskan bilalcaliskan reopened this May 22, 2021
@bilalcaliskan
Author

I have modified the commit message, so there has been a lot of activity since the PR was opened; sorry for that.

@bilalcaliskan
Author

/retest

@k8s-ci-robot
Contributor

@bilalcaliskan: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bilalcaliskan
Author

/retest

@k8s-ci-robot
Contributor

@bilalcaliskan: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bilalcaliskan
Author

/assign @xueweiz @andyxning

@Random-Liu
Member

Random-Liu commented Jun 25, 2021

We discussed this a long time ago when we first started NPD.
The taint can be applied by NPD itself, or by another controller that takes NPD-generated conditions and events as its signal.
Having a separate controller is the pattern most people use right now, because a cluster-level controller has a global view and a better understanding of whether it is OK to taint a node or not.
For example, if 5 out of 10 nodes have an issue, you may not want NPD on each node to taint its node; that may leave a lot of pods with nowhere to run. Instead, a controller could drain the bad nodes one by one and recreate/repair them.

@azman0101

According to the Remedy Systems section, NPD is able to add a condition to nodes that will eventually get them tainted, and then descheduler will evict pods that don't tolerate the taint.

Or we could take advantage of Taint-Based Eviction instead of descheduler.
https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions

Currently, descheduler supports RemovePodsViolatingNodeTaints and NPD can add conditions; nevertheless, there is no way to rely on only those two components to drain nodes automatically. Draino is still required, am I right?
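
For reference, a minimal descheduler policy enabling that strategy might look like the sketch below (v1alpha1 policy format; the taint itself still has to be applied by something else, e.g. Draino or NPD with this PR):

```yaml
# Minimal sketch of a descheduler policy using RemovePodsViolatingNodeTaints.
# It only evicts pods that do not tolerate existing NoSchedule taints;
# it does not add any taints itself.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemovePodsViolatingNodeTaints":
    enabled: true
```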

@Random-Liu
Member

Conditions are mostly informative.

Taints will actually affect scheduling decisions, e.g. not scheduling any new pods to a node, or evicting running pods from a node. That kind of decision should be made by a cluster-level controller; otherwise, in the extreme case where half of the nodes decide to taint themselves and evict pods, the cluster may not have enough resources to run those pods.

@azman0101

azman0101 commented Jun 30, 2021

Conditions are mostly informative.

Taints will actually affect scheduling decisions, e.g. not scheduling any new pods to a node, or evicting running pods from a node. That kind of decision should be made by a cluster-level controller; otherwise, in the extreme case where half of the nodes decide to taint themselves and evict pods, the cluster may not have enough resources to run those pods.

I agree 👍
I'm just saying that the Remedy Systems section is misleading, because it says that NPD + descheduler can do the job together.

I want to avoid using Draino, so I wonder how to taint by custom condition 🤔

@alexispires

NPD can work with the descheduler only via the predefined conditions (Ready, MemoryPressure, ...). So the current design is, in my opinion, definitely limited. I understand the decision to split responsibility, but the documentation is misleading.

@bilalcaliskan
Author

I agree with @azman0101: descheduler does not do the tainting by itself. According to the Remedy Systems section we can use descheduler for that purpose, but sadly I guess there is only one option for the tainting part, and it's Draino.

@Random-Liu As far as I know, descheduler does the job only if the expected taints are already on the node. So we would have to use three different components to taint nodes on a specific NodeCondition (NPD, Draino, and descheduler in that scenario)? I guess that's too much effort, especially if you run more than 10 Kubernetes clusters in production.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 17, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 17, 2023
@vfiftyfive

What is the status on this?

@bilalcaliskan
Author

There is no more work left on the development side, I guess, but it is waiting for a maintainer review.

@sennerholm

@btiernay Could you help find a maintainer who has the time to push this over the finish line?

@btiernay
Contributor

btiernay commented Nov 1, 2023

@andyxning @wangzhen127 @xueweiz @vteratipally @mmiranda96 Are you available to help review this great addition and help to push it over the finish line? 🙏

@btiernay
Contributor

btiernay commented Nov 8, 2023

Kindly requesting your review, one more time 🙏. Thank you 🙇.

@sebastiangaiser

Hey, what is the current status of this PR? Would be great to see this merged 🙏🏻

@btiernay
Contributor

btiernay commented Apr 1, 2024

/retest-required

@k8s-ci-robot
Contributor

@bilalcaliskan: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-npd-e2e-kubernetes-gce-ubuntu-custom-flags | c3bea3b | link | true | /test pull-npd-e2e-kubernetes-gce-ubuntu-custom-flags |
| pull-npd-e2e-kubernetes-gce-ubuntu | c3bea3b | link | true | /test pull-npd-e2e-kubernetes-gce-ubuntu |
| pull-npd-e2e-kubernetes-gce-gci | c3bea3b | link | true | /test pull-npd-e2e-kubernetes-gce-gci |
| pull-npd-e2e-kubernetes-gce-gci-custom-flags | c3bea3b | link | true | /test pull-npd-e2e-kubernetes-gce-gci-custom-flags |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@btiernay
Contributor

btiernay commented Apr 1, 2024

@bilalcaliskan Seems like the tests are failing due to build issues. Maybe it needs a rebase?

@jonasbadstuebner

@bilalcaliskan Would be really nice to have this!

@wangzhen127
Member

Should we close this PR without merging it?

I agree with @Random-Liu's #565 (comment). NPD does not necessarily want to affect scheduling decisions. We already have mechanisms like Taint Nodes by Condition, in which the node controller taints nodes based on node conditions. This should be the recommended model.
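
To illustrate that model: for the built-in conditions, the node controller itself adds well-known taints, which workloads can then tolerate or not. A node under memory pressure, for example, ends up with a taint roughly like this (node name is illustrative):

```yaml
# Taint added by the node controller for the built-in MemoryPressure condition;
# pods without a matching toleration will not be scheduled onto the node.
apiVersion: v1
kind: Node
metadata:
  name: example-node
spec:
  taints:
    - key: node.kubernetes.io/memory-pressure
      effect: NoSchedule
```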

@nvermande

That makes NPD pretty much useless IMO. Thank you for confirming. Plus, given that this PR has been open for 4 years, that raises even more concerns about using it in production.

@btiernay
Contributor

btiernay commented May 4, 2024

Agree with @nvermande's assessment. The lack of this capability makes NPD practically much harder to use, integrate, and be successful with. I can understand why actuation isn't in scope, but imo not having (un)tainting support means poorer ecosystem interop and more work for users.

@dogzzdogzz

Can we introduce an "enabled" flag to allow users to determine whether the node-problem-detector should taint a node and affect scheduling decisions? By default, this feature can be set to false. Additionally, we can provide the considerations in the README to help users evaluate whether to enable this feature.

@sebastiangaiser

sebastiangaiser commented May 11, 2024

@wangzhen127 the README suggests using Draino (no commits on the master branch for 4 years; IMO it should be removed from the README), Medik8s (brings its own ecosystem), and MachineHealthCheck (related to Cluster API) as remedy systems next to descheduler. The outdated Draino makes descheduler "useless" IMO. So to me, rejecting this PR (for ~3 years now) feels like this project is EOL.

in the extreme case half nodes decide to taint and evict pods, the cluster may not have enough resource to run those pods

True, but I think that is up to the user who enables this feature, so it should be documented. An additional idea would be to add a label to the nodes, like tainted-by: node-problem-detector, and verify that only a percentage or a fixed number of nodes can be tainted by node-problem-detector at the same time (see the sketch below).
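
A minimal sketch of that guard idea, assuming a hypothetical tainted-by=node-problem-detector label and a configurable budget (none of this is part of the PR; all names are illustrative):

```go
// Package taintbudget sketches the "cap how many nodes NPD may taint" idea
// from the comment above. Everything here is hypothetical and illustrative.
package taintbudget

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// taintedByLabel is a made-up label that NPD would set on nodes it taints.
const taintedByLabel = "tainted-by=node-problem-detector"

// CanTaintAnotherNode reports whether tainting one more node would keep the
// number of NPD-tainted nodes within maxTainted.
func CanTaintAnotherNode(ctx context.Context, client kubernetes.Interface, maxTainted int) (bool, error) {
	// Count nodes that already carry the (hypothetical) NPD taint label.
	tainted, err := client.CoreV1().Nodes().List(ctx, metav1.ListOptions{LabelSelector: taintedByLabel})
	if err != nil {
		return false, fmt.Errorf("listing NPD-tainted nodes: %w", err)
	}
	return len(tainted.Items) < maxTainted, nil
}
```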

...we already have examples like Taint Nodes by Condition...

True, but these are very basic IMO. Getting this PR merged would give users the possibility to define their own.

A few examples that quickly come to mind:

  • the CNI has problems
  • the BGP router gets misconfigured
  • broken node hardware when running on-premises

There are several use cases where this feature makes absolute sense without creating a new controller, because otherwise we could simply fork this project...

@wangzhen127
Member

wangzhen127 commented May 13, 2024

As far as I know, NPD has been used by several cloud providers and products in production for many years. The reasons why this approach is not recommended have been clearly stated previously. Adding this as an optional feature could work in some cases, but it could also be abused and eventually harm us. Given that several people feel strongly about this feature, I suggest bringing this issue to the wider community in the sig-node weekly meeting for feedback. Please let me know when you plan to discuss this. Thanks!

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 11, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 10, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Labels

  • cncf-cla: yes (the PR's author has signed the CNCF CLA)
  • lifecycle/rotten (the PR has aged beyond stale and will be auto-closed)
  • ok-to-test (a non-member PR verified by an org member as safe to test)
  • size/L (changes 100-499 lines, ignoring generated files)