Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Make each commit a working version and have linear history. #183

Closed
alloy opened this issue Apr 11, 2019 · 44 comments
Closed

[RFC] Make each commit a working version and have linear history. #183

alloy opened this issue Apr 11, 2019 · 44 comments
Assignees
Labels

Comments

@alloy
Copy link
Contributor

alloy commented Apr 11, 2019

Proposal:

For each commit in a repo’s mainline branch (a.k.a master branch) to theoretically be deployable; as in, the test suite/linter/etc are all green.

And for related changes to be a single commit or next to each-other in the commit history.

Reasoning

Some of us work more often with a repo’s commit history, either to debug a software issue or otherwise gain information about why certain changes were made. This is made difficult when something that could semantically be considered a single change is spread out over multiple commits.

Squashing before merging means that all the related changes exist in a single commit and thus in context of each other and the commit message.

Each commit a working version

Multiple commits is problematic when trying to find a commit that introduces a change you’re looking for through tools such as git bisect, as there’s the opportunity to come across commits that are known to be broken (which are typically followed up in the PR with a ‘fix‘ commit) and only interrupt the workflow.

Understanding historical decisions

A git merge may result in commits being interleaved with changes from other PRs, which is problematic when scanning the history for changes you may be interested in and having to keep a mental model of all the commits you came across. Especially those that have a commit message such as ‘fix’ only add noise and make this harder than it needs to be.

Exceptions:

In some cases, a PR may contain a larger set of changes that may not all be directly related to each other. In these cases the author should take care to squash the directly related changes into single commits and then request the assignee to ‘Rebase and merge’ instead, which will ensure they will exist next to each-other in the destination branch but otherwise leaves them in tact.

Additional Context:

Some reading material related to topics mentioned:

How is this RFC resolved?

  • Offer some form of training on git merge vs rebase basics. (Short L&L?)

  • Disable the ’Merge button‘ on repos.

  • By default, ‘Squash and merge’ PRs, unless otherwise requested by adding the ‘Rebase and merge’ label.

    Additionally an assignee may be able to glance from the commit history that the few commits are each their own distinct change and can decide to ‘Rebase and merge’ without the author explicitly asking for it.

  • Peril’s ‘Merge on green’ feature uses the label to choose either ‘Squash and merge’ or ‘Rebase and merge’.

@alloy alloy added the RFC label Apr 11, 2019
@alloy alloy changed the title [RFC] Make each commit a working version and have a linear history. [RFC] Make each commit a working version and have linear history. Apr 11, 2019
@alloy
Copy link
Contributor Author

alloy commented Apr 11, 2019

Some practical information on how to rewrite your commits https://nl.atlassian.com/git/tutorials/rewriting-history

@dleve123
Copy link
Contributor

@alloy is your main goal here to avoid merge commits, or to ensure that there are fewer “WIP”/otherwise poorly named commits in a Git history, or both?

I do think defaulting to a “squash and merge” approach is a useful blunt hammer to reduce “noise” commits, but developing good Git hygiene requires some serious Git-fu (knowing when and how to rebase is a hard thing).

This is all to say that I think some training on rebasing could also be part of resolving this RFC.

@alloy
Copy link
Contributor Author

alloy commented Apr 11, 2019

@alloy is your main goal here to avoid merge commits, or to ensure that there are fewer “WIP”/otherwise poorly named commits in a Git history, or both?

I’m fine with merge commits. I foremost want commits in mainline to be working versions, so we can use tooling like git bisect without jumping though unnecessary hoops. Second I would appreciate commit messages that provide good context for the change.

It’s a nuanced thing, but I don’t particularly care about hygiene for the sake of it. This RFC is about practical improvements.

This is all to say that I think some training on rebasing could also be part of resolving this RFC.

I don’t think most people will need to do any fancy rebasing, other than normal. But I’m certainly open to doing a (short) L&L on the topic! 👍

@alloy
Copy link
Contributor Author

alloy commented Apr 11, 2019

Here’s an example of a ‘Squash and merge’ commit vs the original 25 commits.

@damassi
Copy link
Member

damassi commented Apr 11, 2019

A lot of (particularly newer) devs are understandably intimidated by rebasing, and in particular dealing with merge conflicts, but this is just a matter of education. If we can help ensure that devs know they can rewind (via something like reflog) and not feel trapped inside of history rewrites then I'm 👍. (I've tried to teach rebases to a few devs in the past found that 100% of the time it seems scary until it's not.)

@ashfurrow
Copy link
Contributor

@alloy that example you linked to makes a really strong case for squash-and-merge as the default option for merging PRs. The 25 commits include several merge commits too, which are also filtered out by squash-and-merge. It effectively turns PRs into those semantic, atomic changes you describe.

I'm a big 👍 on this, and having more resources to improve git skills in general, but I'm unclear why we would need to get everyone to switch to rebasing from master instead of merging? It seems like they get filtered out of history by squash-and-merge.

@alloy
Copy link
Contributor Author

alloy commented Apr 12, 2019

@ashfurrow I think that only worked out this way because the list of commits ended up being a fast-forward, but I could be wrong about that and I also don’t necessarily know what work GH does under the hood to perform its squash. So perhaps I’ll just drop that part from the RFC, since it was mostly meant as a suggestion to ease work on the dev for longer running branches, which I think is not the norm.

@zephraph
Copy link
Contributor

I'm 👍 on squash and merge. I generally default to this already for my own PRs just because I tend to do a lot of WIP commits.

Merge commits seem to work fine. I pretty much never rebase and I haven't had any issues with squash merges to this point.

@alloy
Copy link
Contributor Author

alloy commented Apr 12, 2019

Good to know, thanks @zephraph!

@jonallured
Copy link
Member

I'm totally a plus one on this and I'm pumped that we're discussing this type of thing!!

It appears you are interested in removing the part about squash and merge, which is great because that was the part I was going to push back on as well. I like to artisanally craft my commits and I think there's value in having them outlive their PR so prefer the rebase and merge button. I def strive for each commit to be deployable (and green!) as I have seen the value of being able to bisect in a sane way.

I'll also note too that having a linear history is such a nice thing to work with when digging through commits. So much less brainpower - here's a visual:

Screen Shot 2019-04-15 at 9 31 12 AM

Volt on the left and RubyConferences.org on the right where I enforce rebase and merge.

@joeyAghion
Copy link
Contributor

I'm in favor of each git commit corresponding to a theoretically working version. But each commit isn't necessarily releasable; this, in my opinion, is why a pull request is [and should remain] a higher-order concept. I'd much much rather developers work to eliminate crufty commits than we lose that resolution altogether by squashing every PR.

@alloy
Copy link
Contributor Author

alloy commented Apr 15, 2019

It appears you are interested in removing the part about squash and merge

@jonallured I wasn’t really, I was interested in removing the part about how to best pull in upstream changes into your own branch (and have already done so). For people like you that already make sure to have each commit be a working version and standalone change my suggestion is listed under ‘Exceptions’. My sense was that there’s less people that do this and that the least impact on people in the team would be to make ‘squash and merge’ the default and have people like you and me add the label to not squash. Do you think that makes sense and do you have a suggestion for clarifying that in the RFC (seeing as that was clear enough to you)?

@alloy
Copy link
Contributor Author

alloy commented Apr 15, 2019

@joeyAghion Me too, but that would definitely require people to better understand how to use git and tools like rebase–which I assumed might be too hard of a requirement. I’m definitely open to having my assumption changed, though.

@starsirius
Copy link
Member

Yeah I definitely love the git history to be clean linear and relevant ones next to each other. But I'm +1 to @jonallured and @joeyAghion's points here that having the commit-level resolution in the history can be useful. Crafting the individual commits in a PR helps me think about how to tackle a ticket incrementally, and it's also very useful to be able to refer back to particular clean commit without overwhelming information. That said, I personally prefer "rebase and merge" than "squash and merge". But I understand we will need to be comfortable with rebasing.

@alloy
Copy link
Contributor Author

alloy commented Apr 15, 2019

Right. I think we all agree on that being the ideal default situation, but I’d like to hear if y’all have thoughts on if we can ask that from everybody to actually be the default?

@jonallured
Copy link
Member

jonallured commented Apr 15, 2019

Consider the commits from these two pretend PRs:

PR A

* abcdef Extract class for Foo
* qewrty Wrap tests around Foo
* oophda Add ability Bar to Foo

PR B

* abcdef Add ability Bar to new class Foo
* qewrty Fix tests
* oophda Run prettier

For PR A, I believe it would be a shame to lose the three commits that make up that work. In this case, I'd rather rebase and merge. This is debatable of course and somewhat a matter of style and or taste.

For PR B, what I really want is to help this author re-think their commit strategy, but in this case, a squash and merge is totally acceptable. Nothing is gained by keeping these commits separate. This is the whole point of this RFC (correct me if I'm wrong @alloy!). I think the value of these individual commits being pretty small is not debatable but feel free to shout if you disagree!

So I'm torn - my preference is to err on the side of rebase and merge as the default and then work with people to get them used to that style. But that does take more buy-in than moving to squash and merge.

@alloy
Copy link
Contributor Author

alloy commented Apr 15, 2019

I want to re-clarify that my proposal allows for a systematic way of requesting an assignee to leave your commits in tact by applying the ‘Rebase and merge’ label. Only when no label is applied would the default action be to ‘Squash and merge’.

After accepting and implementing this first step, we could absolutely work towards the default being ‘Rebase and merge’, but for now I feel like that is putting too much burden on those among us that feel less comfortable with reworking their git history.

@dleve123
Copy link
Contributor

dleve123 commented Apr 15, 2019

After accepting and implementing this first step, we could absolutely work towards the default being ‘Rebase and merge’, but for now I feel like that is putting too much burden on those among us that feel less comfortable with reworking their git history.

Big 👍 to this – I feel like the original problem to solve is pragmatically getting all devs, regardless of Git-level, to create Git history that's high signal:noise (aka useful commits), so defaulting to "squash and merge", but allowing others to perform a "rebase and merge", best solutions around that problem.

@jonallured
Copy link
Member

my proposal allows for a systematic way of requesting an assignee to leave your commits in tact

I'm up for trying it, but I have concerns that our best intentions here won't be enough and clicking the green button will be what people are used to. 🤷‍♂️

@zephraph
Copy link
Contributor

That's what peril is for 😉

Sent with GitHawk

@anandaroop
Copy link
Member

Late, and mostly affirming what others have said above —

I'm very much in favor of retaining higher-resolution, meaningful commits for authors who practice that style. This has been extremely valuable to me not only during PR review, but also when doing code archaeology months or years after the fact.

That said, I too have felt the pain of a broken git bisect workflow, so I am cautiously 👍 on this.

My only worry is that handling the author's preferred merge strategy through GH labels could prove to be brittle (fine for automated "merge on green" commits, but possibly subject to error otherwise).

Willing to give it a shot though!

@orta
Copy link
Contributor

orta commented Apr 18, 2019

FWIW: Peril cannot change the type of button which will show (it's not in the edit PR API).

So having two separate options which someone will have to choose from each time they merge a PR will be confusing. Especially if they have to go back, look at the commits and decide those commits are good for the history and not worth squashing. It really might be worth simplifying to "always use squash" because that's MVP

Peril can decide what "merge on green" means in terms of merge/squash/rebase - but I'm not sure if moving to always use Peril to merge every PR is a good pattern overall (mainly because Peril isn't perfect and occasionally goes down)

@alloy
Copy link
Contributor Author

alloy commented Apr 18, 2019

I’m open to adding a Chrome/Safari extension to the list of requirements that changes the merge button based on the label.

@mbilokonsky
Copy link
Contributor

Weighing in very late but yes, I tend to use a lot of WIP commits (because I switch computers depending on if I'm WFH or in the office) and don't always have a clean working-commit boundary when I need to commit - so in my mind I tend to assume we are going to squash on merge but we never do.

Personally I think a system that collapses PR's into single commits by default is preferable to expecting individual commit management at such a granular even - so I very much 👍 this RFC. Having the option not to, for folks who practice better git hygiene, is obviously crucial too - but this sounds to me like a very sensible default.

@zephraph
Copy link
Contributor

I don't think a chrome/safari extension is necessary. Here's a potential solution:

  1. Set squash to be the default. It's a good default.
  2. Add a rebase label.
  3. When the rebase label is present on a PR, block merge with a status check from peril saying it should be merged with rebase.
  4. Update merge on green or add a new rule to merge the pr in the correct way.
  • Admin privileges allow PRs to be merged when checks aren't green so the default behavior can be overridden. Still, it should give folks pause to check what the state of the PR is before merging.
  • If peril is down and the PR isn't blocked we still at least have the label. It at least gives more of a visual indicator of how to merge and a little training around the label should cover us for most cases. This is a corner case though, and many other things wouldn't work when peril is unavailable.

@dleve123
Copy link
Contributor

dleve123 commented May 13, 2019

Agreed that we don't need to build a chrome extension. I think just removing to perform a "simple" PR merge button would add a lot of value.

What's needed to resolve this RFC and implement some changes? @alloy I'm happy to prototype out some of this on Auctions systems to see how this works in practice.

@alloy
Copy link
Contributor Author

alloy commented May 15, 2019

@dleve123 I'm looking into how best add the label to each repo and update the merge settings, but for now it would be great if you could test-drive it 👌

  1. Add a “Rebase and Merge” issue label for people to request keeping their commits in tact.
  2. Uncheck the “Allow merge commits” option from the “Merge button” section on https:/artsy/[repo]/settings

@zephraph
Copy link
Contributor

Oh, one thing I ran into yesterday. @damassi had a pr with a few well written commits so I decided to use rebase and merge. This doesn’t, so far as I can tell, give you a mechanism to revert all the commits. Seems like reverting a rebased PR could be painful.

Sent with GitHawk

@alloy
Copy link
Contributor Author

alloy commented May 20, 2019

@zephraph If those changes are so inter-related that together they make up one holistic change, then perhaps that shouldn’t have been separate commits?

@alloy
Copy link
Contributor Author

alloy commented May 20, 2019

@dleve123 Have you been able to give this a try?

I'm thinking of enabling this process on MP and based on that create a Peril rule to create the labels and update the repo settings on each Artsy org repo.

@alloy
Copy link
Contributor Author

alloy commented May 20, 2019

Done, this is now enabled on MP.

@alloy alloy self-assigned this May 20, 2019
@alloy
Copy link
Contributor Author

alloy commented May 20, 2019

Ok, Peril is going to keep adding back the RFC label. I'll just close this as accepted and will follow-up with resolution steps.

@alloy alloy closed this as completed May 20, 2019
@joeyAghion
Copy link
Contributor

Sorry for coming back to this once resolved, but now that I see how this may actually be implemented I feel greater reservations: I do not think we should encourage squashing by default. This leads to the wrong results in the good/common case where an engineer has thoughtfully composed discrete commits.

Rebase-merging is a better alternative but brings other problems with it, such as resulting in different code than may have passed the build.

I think our workflow should encourage good commit-writing and respect those commits through PR merges.

@alloy
Copy link
Contributor Author

alloy commented May 20, 2019

I hear you, but even if it were truly common, which I have my doubts about, it still doesn’t solve for all cases. I would prefer to optimise the cases where somebody needs to work with the source after the fact over artisanal commit messages–which should be preserved btw in a squash and merge.

@damassi
Copy link
Member

damassi commented May 20, 2019

I'm a bit late to this, but thinking: building a culture around conscious committing / reworking history before merging might be a net win over disabling merge commits. It's pretty common to link back to a specific commit to show when something specific was completed. I've observed the Galleries team's thoughtful approach to commit messages and how they contain units of work. That's something cultural, rather than enforced.

@zephraph
Copy link
Contributor

Rounding back up on this. Today @alloy disabled merge commits on metaphysics but we also enabled PR based releases. Unfortunately these two changes aren't compatible.

See the slack conversation for more context.

The TL;DR for anyone following is that squashing creates a new commit (and removes the old) which means staging and release would always be out of date.

I think looking forward we'll need to derive a new release process to be able to have merge commits disabled on a project with them.

FWIW I'm still 👍 with where this RFC settled out (I already did Squash and merge most of the time anyway).

@alloy
Copy link
Contributor Author

alloy commented May 21, 2019

@zephraph I actually think that ‘rebase and merge’ should work, there shouldn’t be any rewritten commits if 2 branches are identical in history.

@joeyAghion
Copy link
Contributor

The more I think about this, the more I really don't think we should obscure commits by squashing them by default. Commits are the most basic unit of git, which is perhaps the most elegant piece of software in existence. Commits and pull requests are different things: a commit represents a coherent set of changes while a pull request represents a chance to integrate that work into a new version or environment. Conflating commits with pull requests risks interfering with the loads of powerful tools and affordances designed to work on commits: diffing, rewriting, squashing, rebasing, interactive rebasing, commenting, bisecting, authorship.... Adapting these to a workflow where master contains completely different commits from our working branches introduces additional cognitive overhead at least or an impossible amount of custom tooling at worst.

I understand the idealized vision of our shared master containing only discrete, working, changesets (almost like a changelog), but I'm not sure that's achievable even if we did squash all PRs (given the frequency of closely-coupled PRs like build- or bug-fixes).

I still agree that our commits could be much higher quality and rebasing/squashing be used more frequently to refine a PR's commits prior to merging. This is very valid PR feedback in my opinion.

</opinion>

@jonallured
Copy link
Member

I think I'd rather have non-linear history with the commits broken out than linear history with commits squashed. Of course my actual preference is linear history with commits broken out, or rebase and merge but there seems to be concerns on that one. 🤷‍♂

@damassi
Copy link
Member

damassi commented May 22, 2019

One Q I have is: how often, as a whole, do people spend traveling back in time debugging. I understand this is a somewhat shady argument. Just trying to point out that there might be a balance to be struck, and while it would be good to think of debugging a working app as first class, we could be losing something in the process of supporting an aim that seems somewhat rarer and somewhat specific to only a subset of the group, and only arises under certain conditions.

(I understand “conditions” to mean, everything is burning — including money. Just saying…)

Going from the assumption that tests are green when a PR is merged (and app is thus runnable), it seems fairly easy to jump back to that commit, and the commits in between, based upon PR merge points using something like the technique described in @anandaroop's link here*, while keeping all of the various metadata intact.

@alloy
Copy link
Contributor Author

alloy commented May 22, 2019

My guess is that I end up doing it more often than others. If the majority feels stronger about keeping commits in tact then so be it.

@alloy
Copy link
Contributor Author

alloy commented May 22, 2019

To be clear, I’m not conflating PRs with commits. I personally take a lot of effort in crafting commits, but I too fail at defining the right boundaries every now and then, leading to some commits on their own being broken. We all do. Asking people to review wether or not the commits in a PR are all drawing their boundaries around the right changes is imo asking too much. So:

  1. we’d ask reviewers to do more work, and;
  2. there’s still potential for people that need to work with history to have to deal with broken commits.

As a pragmatic person, that doesn’t seem like a win in any way, to me. 🤷‍♂

I don’t worry about tooling in general, companies like Facebook and Google all squash before they merge and they rely on tooling a lot more than we do. (This is aside from historical decisions we’ve already taken, such as the PR based deploy process issue.)

To summarise, a ‘squash and merge’ on GitHub will:

  • get all the carefully crafted messages
  • link to the PR it was created from
  • be known to have passed our CI checks in its entirety

And when you have changes that really should be its own thing, then you can still create a separate PR.

I understand the idealized vision of our shared master containing only discrete, working, changesets (almost like a changelog), but I'm not sure that's achievable even if we did squash all PRs (given the frequency of closely-coupled PRs like build- or bug-fixes).

That’s a different issue and unrelated to this RFC. For instance, I’m not asking people to no longer introduce bugs–that would be a weird RFC. What I’m asking is that we limit the amount of time that somebody, who does need to work with history, has to spend in comparison to how much it costs us to achieve that.

@joeyAghion
Copy link
Contributor

Very interesting re. Facebook and Google. 🤔 (I guess they don't depend on diff-ing histories the way I do, or have the bulk to work around it.)

I agree that our PR-driven deploys are a separate issue and not necessarily deal-breakers.

Also, just noting that it's critical for PRs to undergo full CI for this to work. This is always a good thing, but not currently the case for Gravity.

@alloy
Copy link
Contributor Author

alloy commented May 23, 2019

Also, just noting that it's critical for PRs to undergo full CI for this to work. This is always a good thing, but not currently the case for Gravity.

Ahh. I was under the impression that we ran all tests for PRs, but distributed over multiple containers?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests