KIC 2.0 final controller manager flag checkup #1580
What keeps me unhappy about
is that this way we'll lose the distinction between "fail if desired but not working" and "ignore silently if not working". @shaneutt can you think of a solution that satisfies the reason behind your intention to switch from |
Would you mind please describing the problem to solve in terms of acceptance criteria (end result, UX, etc.) so that I can try to make adjustments to satisfy it? |
Acceptance criteria that come to my mind are:
|
I would argue that making these booleans, and then failing the program entirely when
Which to me seems to be along the same lines as you're suggesting. I'm personally not clear on the value of "maybe enable this if" from the end-user perspective, though I'm trying to remain open to suggestion. In the future, though, I can see the value of "eventually" (as opposed to "maybe") enable this, re: #1449. I want the behavior when KIC 2.0 launches to be clear and crisp: you either want a controller/API or you don't. And if you do, but the API isn't available in the cluster, we signal red alert and action needs to be taken. |
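To make the strict boolean behavior described above concrete, here is a minimal Go sketch. It is not KIC's code: `validateEnabled` and both maps are hypothetical stand-ins for the parsed flags and for whatever API discovery the manager performs. It only illustrates the rule "enabled but unavailable is a startup error, disabled is never checked".

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// validateEnabled returns an error naming every controller that is enabled
// but whose API is not served by the cluster. Disabled controllers are never
// checked. Both maps are hypothetical: enabled stands in for the boolean
// flags, available for the result of API discovery.
func validateEnabled(enabled, available map[string]bool) error {
	var missing []string
	for name, on := range enabled {
		if on && !available[name] {
			missing = append(missing, name)
		}
	}
	if len(missing) == 0 {
		return nil
	}
	sort.Strings(missing) // map iteration order is random; keep the message stable
	return fmt.Errorf("enabled controllers with unavailable APIs: %s", strings.Join(missing, ", "))
}
```

A caller would run this once at startup and exit non-zero on a non-nil error, which is the "red alert" case.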
I figured I could potentially communicate better via code, here's a draft PR to illustrate and use as a conversation basis: It was quick and easy to make, so we can iterate on it, close it, etc., whatever seems to fit best. |
Given a zoom conversation with @mflendrich I've removed the original proposed solution from the description and it can instead be seen in #1582 (which I think we're ultimately going to decline as a solution, but for posterity). |
While I like the technical approach to AutoHandlers in #1585, I do think the option to change to booleans makes more sense for UX.

On our end, we want to be opportunistic with optional functionality. Not everyone has Knative installed, but for those users that do, we want to make our support readily available. We default its controller on as such, but that introduces an implementation challenge, since loading it when Knative isn't installed results in a startup failure. Failing to start with a default configuration in the many clusters that don't have Knative is bad UX, so we instead use feature detection and degrade gracefully, disabling it automatically.

We can reasonably assume that users who don't have Knative installed are okay with that controller not loading (after all, it wouldn't do anything if it did). We're not ignoring user intent, since attempting to load that controller is a default behavior: the user hasn't explicitly instructed us to load it, we just try to be opportunistic. We furthermore do log what actually happens, so this isn't silent.

There shouldn't be a case where you need to override the auto-loader and force us to load the controller, since it just results in a startup error. If we did provide the option, changing from

Booleans are simpler to understand and allow us to present a more uniform config for all controllers: each only has the single option (disable it) and the behavior is consistent across all controllers (the controller just doesn't load). Auto is harder to understand since it's not available everywhere (most controllers just don't support it) and is handled differently when it is available (the Knative/KCP auto-disable versus the Ingress version negotiation). |
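For comparison, the three-state behavior under discussion (disabled, enabled, auto with feature detection and a logged outcome) can be sketched as below. The `Setting` type, its constants, and `decide` are hypothetical names made up for illustration and are not taken from the KIC code base; `apiAvailable` stands in for the result of feature detection.

```go
package main

import "fmt"

// Setting is a hypothetical three-state enablement value.
type Setting int

const (
	Disabled Setting = iota
	Enabled
	Auto
)

// decide reports whether a controller should start, plus a message to log so
// that skipping a controller is never silent. An error is returned only when
// the user explicitly enabled a controller whose API is missing.
func decide(name string, s Setting, apiAvailable bool) (start bool, msg string, err error) {
	switch s {
	case Disabled:
		return false, fmt.Sprintf("%s: disabled by configuration", name), nil
	case Enabled:
		if !apiAvailable {
			return false, "", fmt.Errorf("%s: enabled but its API is not available", name)
		}
		return true, fmt.Sprintf("%s: enabled", name), nil
	case Auto:
		if !apiAvailable {
			return false, fmt.Sprintf("%s: API not found, controller not loaded", name), nil
		}
		return true, fmt.Sprintf("%s: API found, controller loaded", name), nil
	default:
		return false, "", fmt.Errorf("%s: unknown setting %d", name, s)
	}
}
```

Collapsing this to booleans, as argued above, would amount to removing one of the branches and picking a single behavior for a missing API.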
I do tend to agree with @rainest on this point. To add to that, we can always add options in the future for specific "automatic" controller features, but more explicitly and only when we're sure we need them. With this flag representing three possible states, we're committing to more than I think we need to in 2.0, and one of those states reflects "automatic setup", which I'm not sure is clearly defined enough yet. |
I agree with the general spirit of your messages @rainest @shaneutt (that a boolean would make for optimal UX) but the devil is in the details:
I think that the cleanup happening in #1585 should be uncontroversial (regardless of whether we go for binary or ternary enablement statuses, we need the logic across controllers to be consistent, and #1585 brings that consistency to AutoHandlers). If we opt for switching from ternary to binary, that could be technically implemented by dropping the |
IMO this is fine. We have inconsistency somewhere--either we have controllers that behave differently when you set enabled or we have controllers that don't allow you to set auto (or if we add it to all, where setting auto has no useful purpose, and simply lets you hit the problem scenario in the second bullet). Absent a perfectly consistent solution that handles the split between controllers we definitely want and controllers we maybe want, I'd lean towards the more consistent config surface, since that's what most users will see.
Reasonable, though we should have a better means of alerting users to this before they attempt an upgrade and it fails to complete/leaves the old ReplicaSet running. Deprecation warnings can go unheeded, and a global "fail if CRDs missing" flag isn't really going to help if we use an ignore-if-absent AutoHandler throughout (there's no way it plays nice with the CRDs we actually want to ignore when missing). For controllers where we don't ever expect the CRD to be absent, we get a solution (albeit the rather clunky "upgrade fails" solution) for free if we never add a true AutoHandler. For those where we want to maybe load, I'm not sure what a good solution is.
Agreed--that sounds fine; we can make the refactor to make the AutoHandler function a controller property rather than special cases in |
Rephrasing some of my responses from earlier since hypothetical UX stuff isn't always the easiest to parse and benefits from different explanations, and because I've had more time to think over it:

Enabling can result in a soft failure

I'd argue this shouldn't be framed as a failure, at least from a user perspective. In a strict technical sense we fail to load an enabled controller, but that occurs inside our implementation, and you, the user, don't care. The net effect of our failing to load the Knative controller is that we cannot generate configuration from any Knative resources. This is fine! There are no Knative resources to load! You can't possibly have created any, because you haven't installed the CRDs necessary to create them.

Change in supported controllers results in unexpected configuration change

While a valid concern that requires a well-explained upgrade path and prominent warnings, this doesn't occur because an enabled controller can't see its CRD and doesn't load. This occurs when we remove support for an API version, e.g. we've created a KongPlugin Upgrading to a version that only supports If we do have an opportunistic

The murky area is version negotiation, where we try to load the newest available version of an API and ignore older versions. I'm not 100% clear on what should happen if you have an environment that has both You end up in a bad place if you have |
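The version negotiation mentioned above ("load the newest available version of an API and ignore older versions") can be sketched in a few lines of Go. This is an illustrative assumption, not KIC's implementation: `negotiate`, the `preferred` list, and the `served` map are hypothetical, with `served` standing in for the group-versions the apiserver reports.

```go
package main

// negotiate returns the first entry of preferred (ordered newest first) that
// the cluster serves, and false when none of them is served. Every older
// version after the first match is ignored.
func negotiate(preferred []string, served map[string]bool) (string, bool) {
	for _, gv := range preferred {
		if served[gv] {
			return gv, true
		}
	}
	return "", false
}
```

For Ingress, `preferred` could be `networking.k8s.io/v1`, `networking.k8s.io/v1beta1`, `extensions/v1beta1`, in that order. The sketch says nothing about what to do with resources that exist only under an ignored version, which is the murky part raised in the comment.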
This problem is going away soon: #1666
For all 3 versions of Ingress we currently support, the apiserver will present the exact same set of Ingress resources. |
Is there an existing issue for this?
Problem Statement
Before we release the KIC 2.0 beta, we need to do one last pass over all the controller-related options for the manager, in particular the enablement settings available for each, and either change the behavior or ensure its consistency under the three-setting approach.
Additional information
No response
Acceptance Criteria