Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTA-1209: enhancements/update/channel-rename-generally-available: New enhancement #1559

Closed
wants to merge 1 commit into from

Conversation

wking
Copy link
Member

@wking wking commented Feb 7, 2024

Trying to package OCPSTRAT-1153 as an enhancement to explore motivation, risks, alternatives, and impacted components. The Feature itself does not currently spend much time on motivation, so most of that is me guessing from my own experience, and not necessarily a reflection on why the business folks want to perform this rename.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Feb 7, 2024

@wking: This pull request references OTA-1209 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this:

Trying to package OCPSTRAT-1153 as an enhancement to explore motivation, risks, alternatives, and impacted components. The Feature itself does not currently spend much time on motivation, so most of that is me guessing from my own experience, and not necessarily a reflection on why the business folks want to perform this rename.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 7, 2024
Copy link
Contributor

openshift-ci bot commented Feb 7, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lalatendumohanty for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wking wking force-pushed the channel-rename branch 3 times, most recently from 8f0910f to 9faaa9f Compare February 7, 2024 05:07
@petr-muller
Copy link
Member

/cc

@wking wking force-pushed the channel-rename branch 2 times, most recently from f21e7af to b850684 Compare February 7, 2024 16:06

Microshift manages installs via RPMs, without using an update service, so channels are not relevant there.

Managed OSD/ROSA collapse channels down into channel-groups, so we'll need to work with them on that.
Copy link
Contributor

@jharrington22 jharrington22 Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We enforce the stable- channel across the fleet and curate edge availability by channel in ClusterImageSets. We're going to need to change this for all 4.16 clusters to fleet-approved-4.y and ensure its in place for any cluster being installed (OCM) into 4.16 and upgrading into 4.16 before we'd be able to GA it.

Sounds like this will probably break in CI for Managed OpenShift during provisioning in nightlies. cc/ @vkareh @pvasant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maorfr for APPSRE awareness

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it will, however we can provide forward compatibility at the API layer by ensuring that any request carrying a channel-group=stable gets translates to fleet-approved internally. Essentially supporting both the old and new wording. This would need to be done prior to the 4.16 CIS being pushed, otherwise OCM will ignore them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a link to managed-upgrade operator's inferUpgradeChannelFromChannelGroup and pointed out that OCM has a similar translation layer that it uses for hosted ROSA (which doesn't use MUO).

And yes, also the ClusterImageSet api.openshift.com/available-upgrades annotation, which I guess I can mention by name, even if there is no public code to link.

Anything else I'm missing that still needs capturing in the enhancement text, aside from the general exposure of "until we can test, we won't be sure we've noticed all the bits that need bumping"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So applying an update requires that the version is in both CIS and available updates on the cluster? How does that work when the cluster is for instance 4.12.40 today and seeking to upgrade to 4.13? Do we just arbitrarily put all clusters into stable-4.Y+1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we just arbitrarily put all clusters into stable-4.Y+1?

The managed-upgrade operator flips them into stable-4.(y+1) just before the update (or other channel based on the target release version and this channel-group-to-channel mapper, to give the CVO... well, maybe they don't wait for the CVO to form an opinion...

@wking wking force-pushed the channel-rename branch 3 times, most recently from 602f188 to 7c7c706 Compare February 8, 2024 17:08
@LalatenduMohanty
Copy link
Member

@vkareh you might want to take a look


#### Managed ClusterImageSet exposure

The current ClusterImageSet configuration assumes the current channel-naming pattern will continue, and it will need updates to understand the pivot to the new names.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a dependency on OCM/Cluster-Service here. If CS is not ready to accept the new names, changes in CIS will likely get ignored.

## Motivation

Some users find the `fast-4.y` vs. `stable-4.y` distinction confusing, even with the context of [the doc section comparing the two channels][fast-stable-channel-strategies].
This enhancement renames those channels to make the distinction more clear, for situations where the user does not have pointers to the additional context of that documentation.
Copy link
Member

@LalatenduMohanty LalatenduMohanty Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some suggestion to enhance the text.

The names of the existing channel names created some confusion among customers. Fast channel is not fast anyway. We only release builds to fast channel after the GA so it is no way fast for our customers. Stable channel is not more stable than fast channel. Because when we add any conditional risk for upgrade paths we add it for all the channels. The only difference between fast and stable channel is we promote versions to stable channel with some delay. There are no other difference exists between these two. With the name change we hope to communicate the actual intent of the channels with our users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In c850c3b, I have quoted in the docs line about the promotion delay being the only difference, which hopefully addresses some of your concerns.

I don't understand the benefit of your "Fast channel is not fast anyway" suggestion. Can you explain what that is trying to express in more detail? I expect that the now-quoted only-difference-line will convey that both are generally-available, but the main point here is to assert that some users seems confused by the distinction, and I dunno if we necessarily need to attempt to itemize the ways in which the distinction has been misunderstood.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the benefit of your "Fast channel is not fast anyway" suggestion.

I do not know how to justify the fast name without saying stable is slow. We only release stuff once it is GA, that is not a definition of fast version or no way communicating that it is fast.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR the fast is name is orthogonal to what velocity we release OpenShift versions, so it is misleading or not communicating the right kind of message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fast-4.y gets 4.y.z and associated update recommendations before stable-4.y, so it is fast relative to that cooking-in-fast delay. But again, do we need to break down "some folks think fast outpaces GA, when it's actually triggered by the GA-declaring errata" in this Motivation paragraph? Or can we handwave more vaguely about the current channel names being related to confusion? Because while "stop using the fast-4.y pattern" is one way to address the confusion (and there are several options in that direction), there are certainly also options that reduce the confusion but which do not remove the fast-4.y name (the Increase access to channel semantics alternative subsection pitches one of these).

@dhellmann
Copy link
Contributor

#1555 is changing the enhancement template in a way that will cause the header check in the linter job to fail for existing PRs. If this PR is merged within the development period for 4.16 you may override the linter if the only failures are caused by issues with the headers (please make sure the markdown formatting is correct). If this PR is not merged before 4.16 development closes, please update the enhancement to conform to the new template.

@wking wking force-pushed the channel-rename branch 8 times, most recently from a636d83 to 29e4689 Compare February 13, 2024 20:23
Copy link
Contributor

@patrickdillon patrickdillon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm from an installer perspective. Thanks for the heads up.

enhancements/update/channel-rename-generally-available.md Outdated Show resolved Hide resolved
@wking wking force-pushed the channel-rename branch 2 times, most recently from 67d30bd to 616d28e Compare February 17, 2024 00:13
@wking wking force-pushed the channel-rename branch 4 times, most recently from d161a9e to 414d3c1 Compare February 21, 2024 22:44

Very relevant to standalone clusters too, since that's the original channeled-update topology.

#### Hive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking wking force-pushed the channel-rename branch 6 times, most recently from 0e38a52 to 1ed35cd Compare February 22, 2024 01:16
Trying to package [1] as an enhancement to explore motivation, risks,
alternatives, and impacted components.  The Feature itself does not
currently spend much time on motivation, so most of that is me
guessing from my own experience, and not necessarily a reflection on
why the business folks want to perform this rename.

[1]: https://issues.redhat.com/browse/OCPSTRAT-1153
Copy link
Contributor

openshift-ci bot commented Feb 23, 2024

@wking: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

The cluster-version operator could be taught to alert on deprecated channel names.
Although currently it is not clear if the old naming patterns will be deprecated, or remain GA forever, more in [discontinuing the aliases](#discontinuing-the-aliases).

#### Continuous integration
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm these findings. It seems that the risk from CI point of view is low. I can't exclude however some situations (especially legacy, historic etc) that will occur once the change is introduced. It won't stop me from giving LGTM from our POV though.

@openshift-bot
Copy link

Inactive enhancement proposals go stale after 28d of inactivity.

See https:/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 3, 2024
@openshift-bot
Copy link

Stale enhancement proposals rot after 7d of inactivity.

See https:/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 10, 2024
@openshift-bot
Copy link

Rotten enhancement proposals close after 7d of inactivity.

See https:/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this Apr 18, 2024
Copy link
Contributor

openshift-ci bot commented Apr 18, 2024

@openshift-bot: Closed this PR.

In response to this:

Rotten enhancement proposals close after 7d of inactivity.

See https:/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dhellmann
Copy link
Contributor

(automated message) This pull request is closed with lifecycle/rotten. The associated Jira ticket, OCPSTRAT-1153, has status "Refinement". Should the PR be reopened, updated, and merged? If not, removing the lifecycle/rotten label will tell this bot to ignore it in the future.

1 similar comment
@dhellmann
Copy link
Contributor

(automated message) This pull request is closed with lifecycle/rotten. The associated Jira ticket, OCPSTRAT-1153, has status "Refinement". Should the PR be reopened, updated, and merged? If not, removing the lifecycle/rotten label will tell this bot to ignore it in the future.

@wking wking deleted the channel-rename branch April 26, 2024 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.