Race condition with Istio sidecar prevents KIC from starting up correctly #4603

gallolp · 2023-09-04T19:06:36Z

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

This is similar to what is observed in #4207.

Due to the new controller startup logic described here and in this PR, if the network is not available when the ingress-controller container starts and it cannot connect to the k8s control plane, all the controllers are disabled.

When the ingress-controller container starts and attempts to get k8s resources before the Envoy sidecar is ready, the requests fail and no routes are added to the Kong proxy instances.

Expected Behavior

The controller should retry or fail (restart) when the control plane is unavailable at boot.

Steps To Reproduce

- Deploy Kong with an Istio sidecar, for example by using the Helm chart in a namespace labeled for Istio sidecar injection.
- Wait for the race condition to happen.

Kong Ingress Controller version

Tested positive on 2.10.x.
Unable to reproduce with 2.7.x.
Should happen on 2.8.x and up.

Kubernetes version

Tested on 1.23 and 1.26.

Anything else?

Sample logs. Check timestamps.

Istio sidecar (extract):

2023-09-04T17:49:40.380745Z    info    Envoy proxy is ready

Ingress controller container logs (extract):

{"level":"info","logger":"controllers.crdCondition","msg":"Disabling controller for Group=configuration.konghq.com/v1beta1, Resource=udpingresses due to missing CRD","time":"2023-09-04T17:49:37Z"}
{"level":"info","logger":"controllers.crdCondition","msg":"Disabling controller for Group=configuration.konghq.com/v1beta1, Resource=tcpingresses due to missing CRD","time":"2023-09-04T17:49:37Z"}
{"level":"info","logger":"controllers.crdCondition","msg":"Disabling controller for Group=configuration.konghq.com/v1, Resource=kongingresses due to missing CRD","time":"2023-09-04T17:49:37Z"}

This issue seems to be tracked here in Istio. One of the proposed solutions here is the use of postStart lifecycle hooks.

Maybe the KIC can implement either:

- retry logic when the control plane is unavailable
- fail logic (distinguish CRD not present from API call failed)

Or, if that is not possible, maybe the Helm chart can add support for lifecycle hooks on the ingress-controller container, as it does for the proxy container.


pmalek · 2023-09-06T14:15:52Z

#4618 might be a better solution than trying to implement it in the chart.

gallolp · 2023-09-06T15:09:38Z

Having the retry logic in the controller would be ideal. It would help in this Istio case and any other case of network outage/delay at pod start.

The container lifecycle hook is just a workaround and it has proven to be ineffective in some cases.

Thank you for looking into this.

gallolp added the bug label Sep 4, 2023

gallolp mentioned this issue Sep 5, 2023: feat: Add lifecycle hook support to ingress container Kong/charts#880 (Closed)

This was referenced Sep 8, 2023:

- feat: use DynamicCRDController with Kong controllers #4619 (Closed)
- fix: ensure connectivity with Kubernetes API on start-up #4641 (Merged)

rainest closed this as completed in #4641 Sep 11, 2023
