Pilot: Use the DEPR process for breaking changes #595

Open
kdmccormick opened this issue Jun 20, 2024 · 13 comments

@kdmccormick
Member

kdmccormick commented Jun 20, 2024

The Maintenance WG is trying out a new way of using the DEPR process. We'll take notes here. If it goes well, we will update the DEPR OEP accordingly. The problem we're trying to solve is that we don't want to increase the number of channels an operator has to follow to find out about removals, breaking changes, or the operational impact of changes. We'll consider this experiment a success if we can manage the next few upcoming maintenance items with breaking changes without having to significantly contort the DEPR process to make this work.

Here's the idea:

  • All breaking changes warrant a DEPR ticket. Not only feature removals, but also upgrades and public API changes.
  • All consumers of edx-platform should have a 6 month window to handle breaking changes, including both folks using named releases and folks using main.
  • When possible, we should think of breaking changes in terms of expand/react/contract
  • When a DEPR is communicated, it should include the following:
    • Communication Date (beginning of comment period)
    • Acceptance Date (~2 weeks from Communication)
    • Target date+release for "expand", i.e., replacement availability (if applicable).
      • In the case of an upgrade (e.g. Node 18), the replacement is the new version (e.g. Node 20)
      • In the case of an API breakage, the replacement would be the new/updated API.
      • In the case of a total removal, there is no "expand" moment, so the 6 month timer starts at Acceptance.
        • Q: Should total removals have a longer comment period?
    • Target date+release for "contract", i.e., removal of the deprecated thing
    • The "react" window (the time between "expand" and "contract") is 6 months by default
      • We want to have a way of easily seeing all the tickets in this state. This is essentially a To Do list for operators, especially operators working off of main.
  • DEPR tickets should remain oriented around removals/breakages. For example, we wouldn't make a DEPR ticket for "Adding Python 3.12 support", but we would have a DEPR ticket for "Dropping Python 3.11 support".
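
As a rough illustration of the dates listed above, here is a minimal Python sketch of how the default timeline could be derived. It is not part of the pilot or of any Open edX tooling: the names `DeprTimeline`, `plan_timeline`, and `add_months` are hypothetical, and the sketch assumes that "~2 weeks" means exactly 14 days and that the 6-month "react" window is counted in calendar months.

```python
import calendar
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


@dataclass
class DeprTimeline:
    """Hypothetical container for the dates a DEPR communication would include."""
    communicated: date       # beginning of the comment period
    accepted: date           # assumed to be exactly 2 weeks after communication
    expand: Optional[date]   # replacement availability; None for a total removal
    contract: date           # removal of the deprecated thing


def plan_timeline(communicated: date,
                  expand: Optional[date] = None,
                  react_months: int = 6) -> DeprTimeline:
    """Derive default dates: the react window starts at expand, or at acceptance if there is none."""
    accepted = communicated + timedelta(weeks=2)
    react_start = expand if expand is not None else accepted
    return DeprTimeline(communicated, accepted, expand, add_months(react_start, react_months))


# An upgrade with a replacement available on 2024-08-01 (dates are made up).
upgrade = plan_timeline(date(2024, 6, 20), expand=date(2024, 8, 1))
# A total removal: no "expand" moment, so the timer starts at acceptance.
removal = plan_timeline(date(2024, 6, 20))
```

The only branch is the total-removal case, which mirrors the bullet stating that the 6 month timer starts at Acceptance when there is no "expand" moment.
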
@kdmccormick
Member Author

@robrap @feanil That is my quick writeup on what we're doing, feel free to edit it or add your own notes.

@robrap
Contributor

robrap commented Jun 20, 2024

Thanks @kdmccormick. This is great. We also mentioned that we plan to expose some of the dates as metadata on the DEPR board so organizations can easily see by when they need to handle the different states.
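
To make the "To Do list for operators" idea concrete, here is a small Python sketch of filtering tickets that are currently in the "react" window. It assumes the board exposes a react-window start date and a contract date per ticket; the `DeprTicket` type, its field names, and the helper functions are hypothetical and do not correspond to any existing board schema or API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class DeprTicket:
    """Hypothetical view of a DEPR ticket with its dates exposed as metadata."""
    title: str
    react_start: date  # "expand" date, or the acceptance date for a total removal
    contract: date     # date the deprecated thing is removed


def in_react_window(ticket: DeprTicket, today: date) -> bool:
    """True while the replacement is available and the old thing is not yet removed."""
    return ticket.react_start <= today < ticket.contract


def operator_todo(tickets: List[DeprTicket], today: date) -> List[DeprTicket]:
    """Tickets an operator should act on now, most urgent (earliest contract) first."""
    open_items = [t for t in tickets if in_react_window(t, today)]
    return sorted(open_items, key=lambda t: t.contract)
```

Sorting by contract date puts the nearest deadline first, which is the ordering an operator working off of main would most likely want.
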

@sarina
Contributor

sarina commented Jun 20, 2024

Happy to see this evolving. I will silently keep tabs, and I'm happy to help with the OEP authoring/reviewing if this experiment succeeds.

I'd love to see a quick sentence or two in the description about (a) what problem this is trying to solve, and (b) what "success" of the experiment is.

@feanil
Contributor

feanil commented Aug 1, 2024

We'll check back in on the pilot in general in Jan 2025.

@robrap
Contributor

robrap commented Aug 7, 2024

In a WG meeting, we mentioned that we think archiving a GitHub repo would not require the 6-month warning.

  • The redirects are handled, so this shouldn't be a breaking change.
  • We didn't discuss how long of a warning should be provided.
  • If an org actively updates the to-be-archived repo, they may need to do work quickly so that they can fork and get off the archived repo.
  • We noted that it would be useful to make it clear in docs that the repo has been archived and should not be used:
    • Adding a banner to its docs noting that it is archived would be helpful.
    • Updating the README would be helpful.
    • Docs tracking Tutor plugins should note plugins that are deprecated.
    • Any other relevant doc index referring to the repo should account for the deprecation.

@feanil
Contributor

feanil commented Sep 5, 2024

In the Maintenance WG we spoke about the DEPR process doing two things:

  • It tells people that a breaking change is happening
  • It provides a place to collect feedback on the change

@brian-smith-tcril
Contributor

Addendum: Ticket scope

DEPR tickets should be made to cover an appropriate amount of the codebase. I will use dropping Node 18 support as an example to help define "appropriate" here:

Too Large

  • Drop Node 18 support in every repository throughout the Open edX GitHub organization.
    • With a scope this large the "expand" window could easily take too long. Snags hit during implementation in some repositories could prevent unrelated repositories moving into the "react" window.
    • With a scope this large there is likely not a unified strategy for addressing the DEPR. Some repositories might no longer need to use Node at all. Some repositories might not be able to support Node 18 and Node 20 in parallel.

Too Small

  • Drop Node 18 support for the analyze-dependents.yml workflow in Paragon.
    • This is an extreme example, but having a DEPR ticket for every GitHub Actions workflow in every repository would be overwhelming.
  • Drop Node 18 support in frontend-app-learner-dashboard
    • While not as extreme an example as "per workflow," a DEPR ticket per repo can also be problematic. Site operators should not need to think about which MFEs still work on Node 18 and which ones don't.

Just Right

  • Drop Node 18 support for Tutor-supported MFEs and supporting libraries
    • By combining these, communication around the DEPR is much clearer. We can say, "In Sumac, you can run MFEs on Node 18 or Node 20. After Sumac, MFEs will no longer support Node 18."
  • Drop Node 18 support in edx-platform
    • The edx-platform repository is both large enough and isolated enough to warrant a standalone DEPR ticket.

Addendum: Unmaintained repositories

Throughout any given DEPR process, there are likely repositories that fall under the DEPR where implementation has not happened. Once the "react" window of a DEPR has concluded and the DEPR has reached the "contract" stage, we should not delay implementation in those repositories.

For example, once we have reached the "contract" stage of the "Drop Node 18 support for Tutor-supported MFEs and supporting libraries" DEPR, updating an MFE or library that didn't "make the cut" should not require writing a new DEPR ticket.

So instead of:

  • We complete the DEPR dropping Node 18 support for Tutor-supported MFEs and supporting libraries
  • We find frontend-app-old_and_unused hasn't been updated to support Node 20 and drop Node 18 support
  • We create a new DEPR ticket to drop Node 18 support in frontend-app-old_and_unused
  • We "expand" frontend-app-old_and_unused to support Node 20 and Node 18
  • We wait 6 months in the "react" phase before dropping Node 18 support

We should:

  • We complete the DEPR dropping Node 18 support for Tutor-supported MFEs and supporting libraries
  • We find frontend-app-old_and_unused hasn't been updated to support Node 20 and drop Node 18 support
  • We create a PR to frontend-app-old_and_unused adding Node 20 support and dropping Node 18 support
    • We reference the completed DEPR in the PR comment

This strategy will allow people looking to update unmaintained repositories to do so without putting DEPR process roadblocks in place.

@kdmccormick
Member Author

At the Maintenance WG meeting on 2024-09-26, we (@feanil @jristau1984 @brian-smith-tcril @robrap ) clarified this part of the pilot:

All consumers of edx-platform should have a 6 month window to handle breaking changes, including both folks using named releases and folks using main.

Rather than defaulting to 6 months of overlapping support between old and new features, we decided that we should default to 6 months of advance communication, including at least a 1-month window of overlapping support between old and new features.

@robrap
Contributor

robrap commented Oct 2, 2024

Rather than defaulting to 6 months of overlapping support between old and new features, we decided that we should default to 6 months of advance communication, including at least a 1-month window of overlapping support between old and new features.

Noting that this came up in response to a discussion specifically about upgrades. The thing about upgrades is that we have specific dates we need to hit, and the replacement may miss the target by some amount of time, but it shouldn't be an unusually large amount of time. That means that hopefully organizations can realistically plan resourcing at the right time, or near the right time.

I'd like to differentiate this from any old feature DEPR. Although we have agreed separately that we wouldn't announce a DEPR that doesn't have a real plan for replacement, I want to find some way to avoid resourcing problems on a DEPR that misses the target by an additional X months, and then organizations now have 1 month to re-find the resources, rather than the original 6-month planning window. Does that make sense? In other words, I want to avoid a situation where orgs are given a 6 month warning, and the replacement takes 1 year, and for the second 6 months orgs are on the hook for possibly having to do the work next month, month after month.

@feanil
Contributor

feanil commented Oct 3, 2024

In other words, I want to avoid a situation where orgs are given a 6 month warning, and the replacement takes 1 year, and for the second 6 months orgs are on the hook for possibly having to do the work next month, month after month.

This makes sense, do you have an alternate proposal in mind for this? I would imagine that as we find that work will not complete on time, we re-adjust the target dates as best we can and communicate them as early as possible? Though there are always some projects that are "almost done" for 6 months due to their complexity.

@robrap
Contributor

robrap commented Oct 4, 2024

This makes sense, do you have an alternate proposal in mind for this?

The original pilot was that orgs have 6 months from a replacement being available, which doesn't have this concern of sliding dates, because the replacement is already available.

One proposal is to limit the scope of this proposed change to upgrades. Here are some things that are unique to upgrades:

  1. There is usually some outside deadline which limits how much dates are likely to slide.
  2. There is a higher cost to a long overlapping period between old and new (e.g. CI cost).

We can also bounce around ideas when we all meet together again.

@robrap
Contributor

robrap commented Oct 4, 2024

@kdmccormick: I just saw your comment on openedx/public-engineering#247 (comment), and we really need more discussion, including @jristau1984. It feels like the switch from 6 months between expand/contract (in this PR description) to a sudden 1 month between expand/contract (in your PR comment) does not address all of the original issues around planning.

On the devstack related work, the team is working in that area, so I imagine that we can get a quicker turnaround. That said, it does raise questions about the process. It could have read: "The replacement will be available around X date, and you'd have 6 months to respond, but I'm hoping for 1 month. Is that doable?". This is how I would imagine this type of conversation would go from the original pilot, but now we've leaped to: "This will get done at some point and then you have 1 month.", which feels very different. [And maybe there were conversations that I didn't see.]

@robrap
Contributor

robrap commented Oct 10, 2024

Update:
We're back to the original proposal:

Rather than defaulting to 6 months of overlapping support between old and new features, we decided that we should default to 6 months of advance communication, including at least a 1-month window of overlapping support between old and new features.

Additionally, the ability to negotiate dates is an explicit part of the process. This could include adjusting the default dates for a specific ticket, or negotiating extensions as-needed (e.g. difficulties that arise, or too many maintenance requests landing at the same time, etc.).
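
For anyone who wants to sanity-check a proposed set of dates against these revised defaults, here is a minimal Python sketch. It is only an illustration: `check_revised_defaults` and `add_months` are hypothetical helpers, and the sketch assumes that "6 months of advance communication" is measured from the Communication Date to the "contract" date and that the overlap is measured from "expand" to "contract". Because negotiating dates is an explicit part of the process, the function returns warnings instead of rejecting a plan.

```python
import calendar
from datetime import date
from typing import List


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def check_revised_defaults(communicated: date,
                           expand: date,
                           contract: date,
                           notice_months: int = 6,
                           overlap_months: int = 1) -> List[str]:
    """Return warnings for dates that fall short of the defaults; an empty list means none."""
    warnings = []
    # Default: at least 6 months between communication and removal.
    if contract < add_months(communicated, notice_months):
        warnings.append(f"less than {notice_months} months of advance communication")
    # Default: old and new are both supported for at least 1 month.
    if contract < add_months(expand, overlap_months):
        warnings.append(f"less than {overlap_months} month(s) of overlapping support")
    return warnings
```

A non-empty result would be a prompt to negotiate an adjusted date on the specific ticket, not a hard failure.
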

Projects
Status: Proposed
Status: In Progress