Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ensure connectivity with Kubernetes API on start-up #4641

Merged
merged 1 commit into from
Sep 11, 2023

Conversation

czeslavo
Copy link
Contributor

@czeslavo czeslavo commented Sep 11, 2023

What this PR does / why we need it:

It introduces a Kubernetes API status check on start-up to prevent issues like #4603 and #4207. With the connectivity ensured, we may rely on controller-runtime default behavior (cache sync timeout will be triggered in case of missing CRDs) and drop our manual checks for Kong's CRD existence.

Adds envtests:

  • TestManagerDoesntStartUntilKubernetesAPIReachable to ensure the manager will not start its runnables (including the config sync loop) until the Kubernetes API is reachable.
  • TestNoKongCRDsInstalledIsFatal to ensure that in case of missing Kong CRDs installation, the controller will eventually crash.
  • TestGatewayAPIControllersMayBeDynamicallyStarted to ensure Gateway API CRDs can be installed during the runtime and their controllers will be started dynamically in such a case.

Also, migrates Knative Ingress controller to use DynamicCRCController just to not maintain 3 ways of initializing the controllers.

Which issue this PR fixes:

Should fix #4603 and #4207.

PR Readiness Checklist:

Complete these before marking the PR as ready to review:

  • the CHANGELOG.md release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR

@czeslavo czeslavo self-assigned this Sep 11, 2023
@czeslavo czeslavo added this to the KIC v2.12.0 milestone Sep 11, 2023
@codecov
Copy link

codecov bot commented Sep 11, 2023

Codecov Report

Patch coverage: 85.4% and no project coverage change.

Comparison is base (0e04e18) 68.0% compared to head (13c68b0) 68.1%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #4641   +/-   ##
=====================================
  Coverage   68.0%   68.1%           
=====================================
  Files        163     163           
  Lines      19091   19067   -24     
=====================================
- Hits       12992   12987    -5     
+ Misses      5331    5304   -27     
- Partials     768     776    +8     
Files Changed Coverage Δ
internal/cmd/rootcmd/run.go 19.0% <0.0%> (-2.1%) ⬇️
internal/manager/conditions.go 89.6% <ø> (-2.1%) ⬇️
internal/controllers/crds/dynamic_controller.go 73.8% <33.3%> (+56.3%) ⬆️
internal/manager/run.go 55.1% <69.4%> (+1.8%) ⬆️
internal/manager/controllerdef.go 98.7% <100.0%> (-0.2%) ⬇️
internal/manager/setup.go 75.5% <100.0%> (ø)

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@czeslavo czeslavo marked this pull request as ready for review September 11, 2023 14:57
@czeslavo czeslavo requested a review from a team as a code owner September 11, 2023 14:57
@czeslavo czeslavo added fix bug Something isn't working labels Sep 11, 2023
Copy link
Contributor

@rainest rainest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd had some concern that OpenShift would do something non-standard for its readiness endpoint, but https:/openshift/installer/blob/master/docs/dev/kube-apiserver-health-check.md confirms it's still /readyz, and building the URL from GetConfig() will deal with their using something other than the kubernetes Service.

We may want to avoid testing the missing CRD cache timeout behavior, since I don't think it's a guaranteed controller-runtime behavior. They do test it upstream, though with a fake cache that simulates no-CRD behavior rather than actually not installing something.

@rainest rainest merged commit ba84ae3 into main Sep 11, 2023
51 checks passed
@rainest rainest deleted the fix/ensure-kubernetes-api branch September 11, 2023 17:51
rainest added a commit that referenced this pull request Sep 11, 2023
randmonkey pushed a commit that referenced this pull request Sep 12, 2023
* chore(docs) add #4641 changelog entry

* chore(controllers) remove Ingress API selection

Remove Ingress API selection in favor of simply honoring the CLI disable
flag.

Support for Ingress versions other than networking/v1 was removed in
#3867. The API
selection code was only selecting between networking/v1 and nothing.
Future Kubernetes versions are not expected to abruptly remove the older
API versions if they introduce new Ingress versions, so this selection
code should not be necessary in the future.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fix size/XL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race condition with Istio sidecar prevents KIC to startup correctly
2 participants