Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add lifecycle hook support to ingress container #880

Closed
wants to merge 1 commit into from

Conversation

gallolp
Copy link

@gallolp gallolp commented Sep 5, 2023

What this PR does / why we need it:

Lifecycle hooks on the ingress-controller container can help in several cases, like:

Which issue this PR fixes

(optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged)

Special notes for your reviewer:

Modified the ingress-controller container template, the value file and the readme.
Thank you for considering the option to add lifecycle hooks to the ingress-controller container.

Checklist

[Place an '[x]' (no spaces) in all applicable fields. Please remove unrelated fields.]

  • PR is based off the current tip of the main branch.
  • New or modified sections of values.yaml are documented in the README.md

@gallolp gallolp requested a review from a team as a code owner September 5, 2023 17:00
@CLAassistant
Copy link

CLAassistant commented Sep 5, 2023

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@rainest
Copy link
Contributor

rainest commented Sep 6, 2023

Per https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks

PostStart
This hook is executed immediately after a container is created. However, there is no guarantee that the hook will execute before the container ENTRYPOINT. No parameters are passed to the handler.

so lifecycle wouldn't actually block the entrypoint (the part that's failing), just run something around the same time. Were you actually seeing different behavior in practice?

AFAICT the Istio workarounds do block other parts of the lifecycle. It looks like adding the PostStart hook would block the Pod from entering the running state, and presumably keep it out of a Service's endpoints, but that doesn't matter for the controller, since it's not used for a Service.

First-class support for sidecar containers looks like a better solution, since the sidecars become a special sort of initContainer and do block main container startup, though that's only recently become available. Do you have access to Kubernetes 1.28 instance where you can enable feature flags to try adding a sidecarContainers injection point instead?

@gallolp
Copy link
Author

gallolp commented Sep 6, 2023

I have been testing the lifecycle workaround for a day with mixed results. I will try to answer the best I can.

Per https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks

PostStart
This hook is executed immediately after a container is created. However, there is no guarantee that the hook will execute before the container ENTRYPOINT. No parameters are passed to the handler.

so lifecycle wouldn't actually block the entrypoint (the part that's failing), just run something around the same time. Were you actually seeing different behavior in practice?

Correct, the hook runs asynchronously so it does not stop the container to start. However, upon failure of the hook the container is restarted. The idea behind the workaround is to keep restarting the container until the Envoy sidecar is ready.

Sadly, that asynchronicity also allows for a scenario in which it doesn't work as intended. I have seen the following happen under heavy load:

  1. The container starts and begins running the code (the setup with the api discovery that fails when there is control plane outage)
  2. The envoy sidecar becomes ready
  3. The postStart hook executes

In this order the container becomes ready but the ingress is in a bad state since the setup run with no control plane access.

AFAICT the Istio workarounds do block other parts of the lifecycle. It looks like adding the PostStart hook would block the Pod from entering the running state, and presumably keep it out of a Service's endpoints, but that doesn't matter for the controller, since it's not used for a Service.

Indeed, while the ingress container is restarted the Pod can't become ready. But this is the desired behavior. Without the hook the Pod becomes ready immediately after the containers are running but the proxy is not configured because the ingress setup failed and the users receive the "No routes matched" error. We are trying to avoid that.

First-class support for sidecar containers looks like a better solution, since the sidecars become a special sort of initContainer and do block main container startup, though that's only recently become available. Do you have access to Kubernetes 1.28 instance where you can enable feature flags to try adding a sidecarContainers injection point instead?

I don't currently have access to a 1.28 cluster. I'm working with 1.23 soon to be updated to 1.26 clusters. The feature looks interesting though.

With all this being said, I understand if you don't want to add the lifecycle config to the ingress controller. It is not a definitive solution, just a partial workaround to some problems.

The introduction of the new controller logic in the 2.8.x release is preventing us from upgrading our Kong installations as there is no way we can guarantee network availability upon pod start (specially under heavy load) and the controller does not retry nor checks for control plain availability before setup.

At this point I am thinking of maintaining a custom KIC image and add some check script at the entrypoint.

Anyway, thanks for reading and for all the help.

@rainest
Copy link
Contributor

rainest commented Sep 11, 2023

Kong/kubernetes-ingress-controller#4641 will just exit if the API server is unreachable, which should resolve the Istio issue.

@rainest rainest closed this Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kong ingress-controller container missing lifecycle configurations
3 participants