120

My office is trying to figure out how we handle branch splits and merges, and we've run into a big problem.

Our issue is with long-term sidebranches -- the kind where you've got a few people working a sidebranch that splits from master, we develop for a few months, and when we reach a milestone we sync the two up.

Now, IMHO, the natural way to handle this is, squash the sidebranch into a single commit. master keeps progressing forward, as it should; we're not retroactively dumping months of parallel development into master's history. And if anybody needs better resolution for the sidebranch's history, well, of course it's all still there -- it's just not in master, it's in the sidebranch.

Here's the problem: I work exclusively with the command line, but the rest of my team uses GUIs. And I've discovered the GUIs don't have a reasonable option to display history from other branches. So if you reach a squash commit, saying "this development squashed from branch XYZ", it's a huge pain to go see what's in XYZ.

On SourceTree, as far as I'm able to find, it's a huge headache: if you're on master, and you want to see the history from master+devFeature, you either need to check master+devFeature out (touching every single file that's different), or else scroll through a log displaying ALL your repository's branches in parallel until you find the right place. And good luck figuring out where you are there.

My teammates, quite rightly, do not want to have development history so inaccessible. So they want these big, long development-sidebranches merged in, always with a merge commit. They don't want any history that isn't immediately accessible from the master branch.

I hate that idea; it means an endless, unnavigable tangle of parallel development history. But I'm not seeing what alternative we have. And I'm pretty baffled; this seems to block off most everything I know about good branch management, and it's going to be a constant frustration to me if I can't find a solution.

Do we have any option here besides constantly merging sidebranches into master with merge-commits? Or, is there a reason that constantly using merge-commits is not as bad as I fear?

Zanon
Standback
  • 84
    Is recording a merge as a merge really that bad? I can see that squashing into a linear history has its advantages, but I'm surprised that not doing so goes against "most everything you know about good branch management". What exactly are the problems that you are afraid of? – IMSoP Dec 19 '16 at 15:47
  • 2
    Possibly related discussion (note that the answers do not say that squashing is universally correct): http://softwareengineering.stackexchange.com/questions/263164/why-squash-git-commits-for-pull-requests/263172#263172 – IMSoP Dec 19 '16 at 15:51
  • @IMSoP: It's a question of scope. Plenty of merge-commits are just fine. But *some* - particularly long-term parallel development - mean you're introducing a bunch of parallel commits. Makes the history one of constant parallel branches, so it's always hard to figure out what's from where, or narrow things down to a state where " `master`'s state on June 4th was so-and-so." My problem isn't judicious use of merge-commits; it's the inability to squash and diverge when that would make more sense. – Standback Dec 19 '16 at 16:01
  • 65
    OK, but in my mind a long-running branch is exactly where you *do* want some visibility, so that blames and bisects don't just land on "the change was introduced as part of the Big Rewrite of 2016". I'm not confident enough to post as an answer, but my instinct is that it's the *tasks within the feature branch* that should be squashed, so that you have a short-ish history of the project accessible from the main history, without having to check out an orphaned branch. – IMSoP Dec 19 '16 at 16:07
  • If the history had been made short-ish, I'd agree entirely. They want FULL history. (Nor are they spending time packaging up their commits into more manageable chunks. We're working on it, but it's a process.) – Standback Dec 19 '16 at 16:10
  • If you've described it to them like you've described it to us, maybe that's where the misunderstanding lies. Your wording implies the opposite: that you don't care how verbose the history on the feature branch is, but want master to be perfectly linear. – IMSoP Dec 19 '16 at 16:13
  • We're all willing to accept any solution that actually works. It's more that, *given* that their development history is messy, squashing seems to me like my only option here. Saying "merges are fine, but on the condition that you all gain three levels in `git rebase`," sounds like an idyllic solution but not a very practical one :-/ – Standback Dec 19 '16 at 16:24
  • 5
    That being said, it's quite possible this question reduces to "I am trying to use git at a higher level than my teammates do; how do I keep them from messing that up for me." Which, fair enough, I just honestly didn't expect `git log master+bugfixDec10th` to be the breaking point :-/ – Standback Dec 19 '16 at 16:26
  • 27
    "long-term sidebranches -- the kind where you've got a few people working a sidebranch that splits from master, we develop for a few months, and when we reach a milestone we sync the two up." Do you not periodically pull from master into side branch? Doing that every (few) commits to master tends to make life simpler in many cases. – TafT Dec 20 '16 at 08:32
  • 1
    @TafT : That would be a much better situation. :) But, no, we've got cases where there are major changes on master that would destabilize the side branch, or require a major effort for merge, and the decision was to let them diverge significantly, and make one concentrated merge effort at a milestone. Everything's easier when you're close to master-branch, but unfortunately we often are not :-/ – Standback Dec 20 '16 at 09:06
  • 2
    @Standback your situation is the common one I come across when using Feature-Branches for anything that takes more than a man week of effort in a large company. Every feature-branch integration can become a major change that destabilizes the side branches. We all have to work with what we have today and contemplate how we might make tomorrow easier. – TafT Dec 20 '16 at 09:22
  • 4
    Is `merge --no-ff` what you want? On `master` itself you have one commit that describes what changed in the branch, but all of the commits still exist and are parented to HEAD. – CAD97 Dec 20 '16 at 16:49
  • How about training your teammates on the command line? – Michael Hampton Dec 20 '16 at 17:53
  • 1
    You never explain what you mean by "infinite." – jpmc26 Dec 21 '16 at 01:14
  • @MichaelHampton : Rest assured, I'm proselytizing for the command line cheerfully, helpfully, and mercilessly :) – Standback Dec 21 '16 at 07:49
  • @jpmc26 : I'll cop to some hyperbole with "infinite." :P Beyond the scope of the question, but: my team isn't very good at going beyond the default "pull/push"; our commit histories are pretty messy as it is. *Two* major messy branches merged side-by-side makes things less navigable (for me; working on it), and projects and branches multiply, so it's absolutely not going to stay at two... – Standback Dec 21 '16 at 07:55
  • @CAD97 : I'm sorry, that's not clear to me. `--no-ff` forces a merge commit even if a fast-forward merge was possible. But if you're merging with `--no-ff` into `master`, everything you merged in becomes included in `master`'s history. Right after your merge, `HEAD` and `master` will be pointing to the same commit. – Standback Dec 21 '16 at 07:58
  • @Standback the point of `--no-ff` is that only the merge commit will be "on `master`", so viewing the history of `master` alone will show just the merge commit. In this I will defer to the other answers who know more than I do. – CAD97 Dec 21 '16 at 16:19
  • 1
    I don't understand this: "we're not retroactively dumping months of parallel development into `master`'s history..." Isn't that *exactly* what you're doing, whether or not the commits you create reflect that reality? Maybe you just meant that `--squash` would prevent that history-dump from showing up in `git log` on `master`, but even in that case I'd suggest that a single massive commit is going to be hard to understand down the road. – Kyle Strand Dec 22 '16 at 23:01
  • @KyleStrand : That's an excellent point, and absolutely correct. As I see it, it's a tradeoff between reflecting the precise order of real events, and being able to follow long individual strands of development as being *independent,* isolated from the parallel changes -- which is also how that development really took place. I also see the commit history of `master` as ultimately "here is how the *master* branch changed" -- which is also much more easily expressed by a master branch that moved forwards, rather than "master existed in multiple states in parallel." – Standback Dec 25 '16 at 06:22
  • Personally, I feel like single-massive-commits are a fairly minor issue, **on the condition** that they clearly reference a side-branch which *does* have the full history. That's because upon finding the squash-commit, I just repeat whatever-it-was-I-was-doing, this time on the side branch -- which, in the command line, is trivial. I'm growing to understand, though, that this can be a very unpopular workflow :P – Standback Dec 25 '16 at 06:25

7 Answers

245

Even though I use Git on the command line, I have to agree with your colleagues. It is not sensible to squash large changes into a single commit. You are losing history that way, not just making it less visible.

The point of source control is to track the history of all changes. When did what change why? To that end, every commit contains pointers to parent commits, a snapshot of the source tree (from which diffs are derived), and metadata like a commit message. Each commit describes the state of the source code and the complete history of all changes that led up to that state. The garbage collector may delete commits that are not reachable.

Actions like rebasing, cherry-picking, or squashing delete or rewrite history. In particular, the resulting commits no longer reference the original commits. Consider this:

  • You squash some commits and note in the commit message that the squashed history is available in original commit abcd123.
  • You delete[1] all branches or tags that include abcd123 since they are merged.
  • You let the garbage collector run.

[1]: Some Git servers allow branches to be protected against accidental deletion, but I doubt you want to keep all your feature branches for eternity.

Now you can no longer look up that commit – it just doesn't exist.
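The three steps above can be reproduced in a throwaway repository. This is only a sketch: the branch name `feature`, the file `f.txt`, and the demo identity are made up, and the reflog expiry and `--prune=now` are there to force immediately what Git's default grace periods would otherwise delay by weeks.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git commit -q --allow-empty -m "base"

# Two commits on a hypothetical feature branch
git checkout -q -b feature
echo one > f.txt; git add f.txt; git commit -q -m "feature step 1"
echo two >> f.txt; git commit -q -am "feature step 2"
tip=$(git rev-parse HEAD)

# Step 1: squash onto master and mention the original commit in the message
git checkout -q master
git merge -q --squash feature
git commit -q -m "Squashed feature; full history was at $tip"

# Step 2: delete the branch that kept the original commits reachable
git branch -q -D feature

# Step 3: let the garbage collector run (forced to act immediately for the demo)
git reflog expire --expire=now --all
git gc -q --prune=now

# The commit named in the squash message can no longer be looked up
if git cat-file -e "$tip" 2>/dev/null; then gone=no; else gone=yes; fi
echo "original feature tip gone: $gone"
```

In a real repository the reflog usually keeps such commits around for a while, so the loss tends to show up much later, when nobody remembers the branch.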

Referencing a branch name in a commit message is even worse, since branch names are local to a repo. What is master+devFeature in your local checkout might be doodlediduh in mine. Branches are just moving labels that point to some commit object.

Of all history rewriting techniques, rebasing is the most benign because it duplicates the complete commits with all their history, and just replaces a parent commit.

That the master history includes the complete history of all branches that were merged into it is a good thing, because that represents reality.[2] If there was parallel development, that should be visible in the log.

[2]: For this reason, I also prefer explicit merge commits over the linearized but ultimately fake history resulting from rebasing.

On the command line, git log tries hard to simplify the displayed history and keep all displayed commits relevant. You can tweak history simplification to suit your needs. You might be tempted to write your own git log tool that walks the commit graph, but it is generally impossible to answer “was this commit originally committed on this or that branch?”. The first parent of a merge commit is the previous HEAD, i.e. the commit in the branch that you are merging into. But that assumes that you didn't do a reverse merge from master into the feature branch, then fast-forwarded master to the merge.
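The caveat about reverse merges can be seen in a small experiment. This is a sketch with invented branch names and empty commits, not anything from the original setup: master is merged into the feature branch, then master is fast-forwarded to that merge, and the first parent ends up being the feature commit.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

git checkout -q -b feature
git commit -q --allow-empty -m "feature work"
feature_tip=$(git rev-parse HEAD)

git checkout -q master
git commit -q --allow-empty -m "master work"
master_tip=$(git rev-parse HEAD)

# Reverse merge: bring master into the feature branch
git checkout -q feature
git merge -q --no-edit master

# Then fast-forward master to that merge commit
git checkout -q master
git merge -q --ff-only feature

# The first parent is the previous HEAD of the branch we merged *on*,
# which was the feature branch, not master
first_parent=$(git rev-parse master^1)
second_parent=$(git rev-parse master^2)
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```

Here `first_parent` matches `feature_tip`, so a tool that follows first parents from master would walk into the feature commits rather than master's own.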

The best solution to long-term branches I've encountered is to prevent branches that are only merged after a couple of months. Merging is easiest when the changes are recent and small. Ideally, you'll merge at least once per week. Continuous integration (as in Extreme Programming, not as in “let's set up a Jenkins server”) even suggests multiple merges per day, i.e. not maintaining separate feature branches but sharing a development branch as a team. Merging before a feature is QA'd requires that the feature is hidden behind a feature flag.

In return, frequent integration makes it possible to spot potential problems much earlier, and helps to keep a consistent architecture: far-reaching changes are possible because these changes are quickly included in all branches. If a change breaks some code, it will only break a couple of days' work, not a couple of months'.

History rewriting can make sense for truly huge projects with multiple million lines of code and hundreds or thousands of active developers. It is questionable why such a large project would have to be a single git repo instead of being divided into separate libraries, but at that scale it is more convenient if the central repo only contains “releases” of the individual components. E.g. the Linux kernel employs squashing to keep the main history manageable. Some open source projects require patches to be sent via email, instead of a git-level merge.

amon
  • Do I understand correctly that you see *any* history-rewriting as undesirable (except large projects, etc etc)? Like, if a developer made a bunch of quick-and-dirty commits on a feature , none of which makes sense on its own but only as a group, you wouldn't want them to clean that up into a friendlier, more readable commit? – Standback Dec 19 '16 at 16:55
  • 44
    @Standback I don't care what a developer does locally … use “wip” commits, extra commits for each fixed typo, …. That's fine, it's better to commit too often. Ideally, the developer cleans up those commits before they are pushed, like combining all commits that just fix some typos. It's still a good idea to keep commits separate if they do different things, e.g. one commit for actual functionality and related tests, another for unrelated typos. But once the commits are pushed, rewriting or deleting history is more trouble than it's worth, at least in my experience. – amon Dec 19 '16 at 17:09
  • There's also the question how the history is read – I look at recent commits from all branches to see what others are working on, select some commits to see what I would be pushing, and mainly use `git annotate` to figure out why this particular piece of code was written that way. That means that for my use cases, any complicated history is largely invisible. – amon Dec 19 '16 at 17:09
  • 11
    "it is generally impossible to answer “was this commit originally committed on this or that branch?"" I see the fact that "branches" are just pointers to commits with no real history of their own as one of the biggest design flaws in git. I really want to be able to ask my version control system "what was in branch x on date y". – Peter Green Dec 19 '16 at 18:40
  • 1
    But if you merge the squashed commit with the original long history, won't you get the best of both worlds? It'll have two parents, one a single big change and the other a series of small mistakes. – Owen Dec 19 '16 at 18:56
  • Thanks for the awesome answer! I view source-control somewhat differently than you describe -- not only "who changed what lines when," but a tool for managing changes to the codebase, each (ideally) with its own significance; each (ideally) making good sense, in context. I also find git very powerful for comparing different points in history (meaningful commits == meaningful comparisons), for treating commits as patches, and other things that make clean histories super-helpful. All that being said, it's not all about me. :P And Karl's answer helps me see how merges are workable for me, too. – Standback Dec 19 '16 at 21:04
  • @Standback: I tend to view "meaningful commits" with reference to the mantra "commit early, commit often". Things are meaningful if they implement a behaviour required by a feature. So a meaningful commit to me is often two lines in total but can go as large as around a dozen lines. More than that and it's no longer the smallest meaningful commit but a single large commit with multiple meanings. It's kind of like TDD - every green cycle (code passes all expected tests up to that point) is an opportunity to commit (though I don't usually do that because it breaks the TDD rhythm) – slebetman Dec 19 '16 at 22:01
  • 84
    Nothing is more frustrating than trying to identify the cause of a bug in the code using `git bisect`, only to have it arrive at a 10,000 line uber-commit from a side branch. – tpg2114 Dec 20 '16 at 01:14
  • 2
    @Standback IMHO the smaller the commit is the more readable and friendly it is. A commit that touches more than a few points is impossible to grasp at first look, so you just take the description at face value. That's readability of the description (eg "implemented feature X"), not the commit (code) itself. I'd rather have a 1000 of one-letter commits like "changed http to https" : ) – Agent_L Dec 20 '16 at 11:23
  • 1
    If only commits could be grouped, and the group given a comment, with most tools only reporting at the group level unless you asked for more details..... – Ian Dec 21 '16 at 17:29
  • 2
    cherry-pick doesn't rewrite or delete history. it just creates new commits which replicate changes from existing commits – Display Name Dec 21 '16 at 20:51
  • @SargeBorsch “History rewriting” is a bit of a misnomer in Git since commits are immutable. I included cherry-picking in that list because it applies a change without maintaining the history of that change – very much like applying a patch. That is *equivalent* to squashing (which is like aggregating many commits into a single patch), and in fact *exactly identical* to rebasing (which re-applies a number of changes with a different history). The only difference between cherry-picking and rebasing is where the branch labels point afterwards. – amon Dec 22 '16 at 11:17
  • Core paragraph "Continuous Integration", combined with feature flags and a proper (re)design of the application all leads to [Branch by Abstraction](http://martinfowler.com/bliki/BranchByAbstraction.html) – Henk Langeveld Dec 24 '16 at 10:52
  • @PeterGreen The commit it points to should represent its history. Ideally, what you want is `git log --until="$date_y" "$branch_x"`. The only time when that won't work is when you've employed some history rewriting facility like rebase, but such things are not meant to be part of a normal workflow; tracking their actions should be pointless. If it wouldn't be pointless in your case, that means that you're doing it as part of your workflow like how the OP merges by not merging and instead does a squashed commit with broken history. – JoL Dec 25 '16 at 12:05
  • 1
    YES. I love this answer and actually wrote a similar blog post recently (when GitHub made "squash and merge" the default for PRs): https://strugee.net/blog/2016/10/github-squash-and-merge-default-considered-harmful. two things to point out from that post: 1. it's worth noting that Mercurial with Changeset Evolution in theory has none of these problems and 2. it's extremely rare but history rewriting (_especially_ - I think - super destructive rewriting like squashing) can give you a Very Bad Time if you're maintaining an independent downstream fork that regularly merges from upstream. – strugee Dec 26 '16 at 09:54
111

I like Amon's answer, but I felt one small part needed a lot more emphasis: You can easily simplify history while viewing logs to meet your needs, but others cannot add history while viewing logs to meet their needs. This is why keeping the history as it occurred is preferable.

Here's an example from one of our repositories. We use a pull-request model, so every feature looks like your long running branches in history, even though they usually only run a week or less. Individual developers sometimes choose to squash their history before merging, but we often pair up on features, so that's relatively unusual. Here's the top few commits in gitk, the GUI that comes bundled with git:

[Image: standard gitk view]

Yes, a bit of a tangle, but we also like it because we can see precisely who had what changes at what time. It accurately reflects our development history. If we want to see a higher-level view, one pull request merge at a time, we can look at the following view, which is equivalent to the git log --first-parent command:

[Image: gitk view with --first-parent]

git log has many more options designed to give you precisely the views you want. gitk can take any arbitrary git log argument to build a graphical view. I'm sure other GUIs have similar capabilities. Read the docs and learn to use it properly, rather than enforcing your preferred git log view on everyone at merge time.
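For anyone who wants to try the two views without our repository, here is a minimal sketch in a scratch repo. The branch name and commit messages are invented; the only options used are `--no-ff` on the merge and `--first-parent` on the log side.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

# A feature branch with three small commits
git checkout -q -b feature
for n in 1 2 3; do
  git commit -q --allow-empty -m "feature step $n"
done

# Merge with an explicit merge commit so master has a first-parent line of its own
git checkout -q master
git merge -q --no-ff --no-edit feature

# Full history versus the one-entry-per-merge view
all=$(git rev-list --count master)
mainline=$(git rev-list --count --first-parent master)
echo "all commits: $all, first-parent commits: $mainline"
git log --oneline --first-parent master
```

The full history stays in the repository either way; `--first-parent` only changes what the viewer chooses to show.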

Karl Bielefeldt
  • 24
    I was not familiar with this option! This is a huge help for me, and it sounds like learning more log options will let me live with this more easily. – Standback Dec 19 '16 at 17:42
  • 26
    +1 for the "You can easily simplify history while viewing logs to meet your needs, but others cannot add history while viewing logs to meet their needs." Re-writing history is always questionable. If you thought it important to record at the time of commit then it was important. Even if you found it was wrong or re-did it later that is part of the history. Some mistakes only make sense when you can see in blame that this one line was left over from a later re-write. When mistakes are folded in with the rest of the epic you cannot review why things ended up how they are. – TafT Dec 20 '16 at 08:30
  • @Standback Related, one thing that helps maintain this structure for me, is using `merge --no-ff` - don't use fast-forward merges, instead always create a merge commit so that `--first-parent` has something to work with – Izkata Dec 24 '16 at 23:29
34

Our issue is with long-term sidebranches -- the kind where you've got a few people working a sidebranch that splits from master, we develop for a few months, and when we reach a milestone we sync the two up.

My first thought is - don't even do this unless absolutely necessary. Your merges must be challenging sometimes. Keep branches independent and as short-lived as possible. It's a sign that you need to break your stories up into smaller implementation chunks.

In the event that you have to do this, then it is possible to merge in git with --no-ff option so that the histories are kept distinct on their own branch. The commits will still appear in the merged history but can also be seen separately on the feature branch so that at least it's possible to determine which line of development they were part of.

I have to admit when I first started using git I found it a little strange that the branch commits appeared in the same history as the main branch after the merge. It was a little disconcerting because it didn't seem like those commits belonged in that history. But in practice, it's not something that's really painful at all, if one considers that the integration branch is just that - its whole purpose is to combine the feature branches. In our team, we don't squash, and we do frequent merge commits. We use --no-ff all the time to ensure that it's easy to see the exact history of any feature should we want to investigate it.

Bradley Thomas
12

Let me answer your points directly and clearly:

Our issue is with long-term sidebranches -- the kind where you've got a few people working a sidebranch that splits from master, we develop for a few months, and when we reach a milestone we sync the two up.

You usually do not want to leave your branches unsynced for months.

Your feature branch has branched off of something depending on your workflow; let's just call it master for the sake of simplicity. Now, whenever you commit to master, you can and should git checkout long_running_feature ; git rebase master. This means that your branches are, by design, always in sync.
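A quick way to see what that rebase does, sketched in a scratch repository (file names and commit messages are made up; `long_running_feature` is the name used above). Note that, as a comment elsewhere in this thread points out, rebasing a branch that several people share needs some coordination.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

git checkout -q -b long_running_feature
echo feature > feature.txt; git add feature.txt; git commit -q -m "feature work"

# Meanwhile, master moves on
git checkout -q master
echo fix > fix.txt; git add fix.txt; git commit -q -m "master fix"

# Re-sync the feature branch onto the new master
git checkout -q long_running_feature
git rebase -q master

# After the rebase the branch forks from the current tip of master
fork_point=$(git merge-base master long_running_feature)
master_tip=$(git rev-parse master)
behind=$(git rev-list --count long_running_feature..master)
echo "feature is $behind commits behind master"
```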

git rebase is also the correct thing to do here. It is not a hack or something weird or dangerous, but completely natural. You lose one bit of information, which is the "birthday" of the feature branch, but that's it. If somebody finds that to be important, it could be provided by saving it somewhere else (in your ticket system, or, if the need is great, in a git tag...).

Now, IMHO, the natural way to handle this is, squash the sidebranch into a single commit.

No, you absolutely do not want that, you want a merge commit. A merge commit also is a "single commit". It does not, somehow, insert all the individual branch commits "into" master. It is a single commit with two parents - the master head and the branch head at the time of the merge.

Be sure to specify the --no-ff option, of course; merging without --no-ff should, in your scenario, strictly be forbidden. Unfortunately, --no-ff is not the default; but I believe there is an option you can set that makes it so. See git help merge for what --no-ff does (in short: it activates the behaviour I described in the previous paragraph), it is crucial.
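The option I was thinking of is, as far as I know, the `merge.ff` configuration setting; setting it to `false` makes a plain `git merge` behave as if `--no-ff` had been given. A small sketch in a scratch repo (branch name and commits are invented):

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

# Repository-level setting: never fast-forward, always create a merge commit
git config merge.ff false

git checkout -q -b feature
git commit -q --allow-empty -m "feature work"

# master has not moved, so without the setting this would be a fast-forward
git checkout -q master
git merge -q --no-edit feature

# The new HEAD is a real merge commit with two parents
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of HEAD: $parent_count"
```

Use `git config --global merge.ff false` instead if you want it for every repository on your machine; each teammate has to set it themselves, since the setting is not shared through the repo.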

we're not retroactively dumping months of parallel development into master's history.

Absolutely not - you are never dumping something "into the history" of some branch, especially not with a merge commit.

And if anybody needs better resolution for the sidebranch's history, well, of course it's all still there -- it's just not in master, it's in the sidebranch.

With a merge commit, it is still there. Not in master, but in the sidebranch, clearly visible as one of the parents of the merge commit, and kept for eternity, as it should be.

See what I've done? All things you describe for your squash commit are right there with the merge --no-ff commit.
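If you want to convince yourself that the sidebranch history really is kept, here is a sketch with made-up names: merge with `--no-ff`, delete the branch label, force a garbage collection, and the branch commits are still there as the second parent of the merge.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

git checkout -q -b feature
echo one > f.txt; git add f.txt; git commit -q -m "feature step 1"
echo two >> f.txt; git commit -q -am "feature step 2"
tip=$(git rev-parse HEAD)

git checkout -q master
git merge -q --no-ff --no-edit feature

# Drop the branch label and garbage-collect as aggressively as possible
git branch -q -d feature
git reflog expire --expire=now --all
git gc -q --prune=now

# The branch history survives because the merge commit points at it
if git cat-file -e "$tip" 2>/dev/null; then kept=yes; else kept=no; fi
branch_side=$(git rev-parse master^2)
echo "feature tip kept: $kept"
git log --oneline master^2
```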

Here's the problem: I work exclusively with the command line, but the rest of my team uses GUIs.

(Side remark: I almost exclusively work with the command line as well (well, that's a lie, I usually use emacs magit, but that's another story - if I am not in a convenient place with my individual emacs setup, I prefer the command line as well). But please do yourself a favour and try at least git gui once. It is so much more efficient for picking lines, hunks etc. for adding/undoing adds.)

And I've discovered the GUIs don't have a reasonable option to display history from other branches.

That is because what you are trying to do is totally against the spirit of git. git builds from the core on a "directed acyclic graph", which means, a lot of information is in the parent-child-relationship of commits. And, for merges, that means true merge commits with two parents and one child. The GUIs of your colleagues will be just fine as soon as you use no-ff merge commits.

So if you reach a squash commit, saying "this development squashed from branch XYZ", it's a huge pain to go see what's in XYZ.

Yes, but that is not a problem of the GUI, but of the squash commit. Using a squash means you leave the feature branch head dangling, and creating a whole new commit into master. This breaks the structure on two levels, creating a big mess.

So they want these big, long development-sidebranches merged in, always with a merge commit.

And they are absolutely right. But they are not "merged in", they are just merged. A merge is a truly balanced thing, it has no preferred side that is merged "into" the other (git checkout A ; git merge B is exactly the same as git checkout B ; git merge A except for minor visual differences like the branches being swapped around in git log etc.).

They don't want any history that isn't immediately accessible from the master branch.

Which is completely correct. At a time when there are no unmerged features, you would have a single branch master with a rich history encapsulating all feature commit lines there ever were, going back to the git init commit from the beginning of time (note that I specifically avoided using the term "branches" in the latter part of that paragraph because the history at that time is not "branches" anymore, although the commit graph would be quite branchy).

I hate that idea;

Then you are in for a bit of pain, since you are working against the tool you are using. The git approach is very elegant and powerful, especially in the branching/merging area; if you do it right (as alluded to above, especially with --no-ff) it is by leaps and bounds superior to other approaches (e.g., the subversion mess of having parallel directory structures for branches).

it means an endless, unnavigable tangle of parallel development history.

Endless, parallel - yes.

Unnavigable, tangle - no.

But I'm not seeing what alternative we have.

Why not work just like the inventor of git, your colleagues and the rest of the world do, every day?

Do we have any option here besides constantly merging sidebranches into master with merge-commits? Or, is there a reason that constantly using merge-commits is not as bad as I fear?

No other options; not as bad.

AnoE
10

Squashing a long term sidebranch would make you lose a lot of information.

What I would do is try to rebase master into the long term sidebranch before merging the sidebranch into master. That way you keep every commit in master, while making the commit history linear and easier to understand.

If I couldn't do that easily at each commit, I would let it be non-linear, in order to keep the development context clear. In my opinion, if I have a problematic merge during the rebase of master into the sidebranch, it means the non-linearity had real-world significance. That means it will be easier to understand what happened in case I need to dig into the history. I also get the immediate benefit of not having to do a rebase.

  • +1 for mentioning `rebase`. Whether or not it's the best approach here (I personally haven't had great experiences with it, although I haven't used it much), it definitely seems to be the nearest thing to what OP really seems to want--specifically, a compromise between an out-of-order history dump and completely hiding that history. – Kyle Strand Dec 22 '16 at 23:00
1

Personally I prefer to do my development in a fork, then pull requests to merge into the primary repository.

That means that if I want to rebase my changes on top of master, or squash some WIP commits, I can totally do that. Or I can just request that my whole history be merged in as well.

What I like to do is do my development on a branch but frequently rebase against master/dev. That way I get the most recent changes from master without having a bunch of merge commits into my branch, or having to deal with a whole load of merge conflicts when it's time to merge to master.

To explicitly answer your question:

Do we have any option here besides constantly merging sidebranches into master with merge-commits?

Yes - you can merge them once per branch (when the feature or fix is "complete") or if you don't like having the merge commits in your history you can simply do a fast forward merge on master after doing a final rebase.

Wayne Werner
  • 3
    I'm sorry - I do the same thing, but I don't see how this answers the question :-/ – Standback Dec 19 '16 at 21:24
  • 2
    Added an explicit answer to your stated question. – Wayne Werner Dec 19 '16 at 21:29
  • So, that's fine for my own commits, but I (and my team) will be dealing with *everybody's* work. Keeping my own history clean is easy; working in a messy dev history is the issue here. – Standback Dec 19 '16 at 21:50
  • You mean you want to require other devs to... what? Not rebase and just squash on merge? Or were you just posting this question to gripe about your coworkers' lack of git discipline? – Wayne Werner Dec 19 '16 at 21:56
  • I'm sorry, no, that's not what I meant at all. Sorry if I was unclear. :) My question is about team policy and workflow for bringing big, long-running branches (with multiple developers) back into the master branch; your answer is only relevant to changes that are local, short-term, and all belong to me. – Standback Dec 19 '16 at 22:31
  • 1
    regularly rebasing is not useful if several developers are working on that feature branch. – Paŭlo Ebermann Dec 20 '16 at 00:30
  • @PaŭloEbermann not so. If you have several devs working on the same feature you simply treat one of the forks as the primary branch, and everyone rebases on that. Presumably the team lead. Or just always rebase before you push. Then the entire branch gets rebased... or merged in to master. – Wayne Werner Dec 20 '16 at 06:41
  • @Standback why doesn't this answer your question? If you can't give a specific "rebasing doesn't work because..." then I submit you don't understand your problem. Or rebasing ;) – Wayne Werner Dec 20 '16 at 06:44
-1

Revision control is garbage-in garbage-out.

The problem is the work-in-progress on the feature branch can contain a lot of "let's try this ... no that didn't work, let's replace it with that" and all the commits except the final "that" just end up polluting the history uselessly.

Ultimately, the history should be kept (some of it might be of some use in the future), but only a "clean copy" should be merged.

With Git, this can be done by branching the feature branch first (to keep all the history), then (interactively) rebasing the branch of the feature branch from master and then merging the rebased branch.
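Roughly, the sequence looks like this. It is a sketch only: `feature-clean` is a name I made up for the copy, and the `GIT_SEQUENCE_EDITOR` trick merely stands in for what you would normally do by hand in the interactive rebase editor (here it turns every commit after the first into a `fixup`).

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_EDITOR=true   # never open an interactive editor in this demo
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

# A messy feature branch: try, fail, replace
git checkout -q -b feature
echo a > f.txt; git add f.txt; git commit -q -m "let's try this"
echo b > f.txt; git commit -q -am "no, that didn't work"
echo c > f.txt; git commit -q -am "replace it with that"

# Branch the feature branch first, so the original history is kept as-is
git checkout -q -b feature-clean

# Interactive rebase of the copy; the scripted editor folds commits 2..n into the first
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/fixup/'" git rebase -q -i master

messy=$(git rev-list --count master..feature)
clean=$(git rev-list --count master..feature-clean)
echo "feature: $messy commits, feature-clean: $clean commit(s)"

# Only the clean copy gets merged; the original branch is left lying around
git checkout -q master
git merge -q --ff-only feature-clean
```

Both branches end up with the same file contents; only the number of commits leading there differs.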

DepressedDaniel
  • 1
    Just to clarify: what you're saying is to rewrite the history on (a copy of) the feature-branch, so it's got a short, clean history - eliminating all the back-and-forth of initial development. And then, merging the rewritten branch into master is simpler and cleaner. Did I understand you correctly? – Standback Dec 19 '16 at 22:37
  • 1
    Yes. And when you merge into master it will just be a fast forward (assuming you rebased recently before merging). – DepressedDaniel Dec 19 '16 at 23:04
  • But how is this history then kept? Do you let the original feature branch lying around? – Paŭlo Ebermann Dec 20 '16 at 00:32
  • @PaŭloEbermann Bingo. – DepressedDaniel Dec 20 '16 at 00:39
  • Well, as someone else said in a comment: branches are just pointers into the history graph in `git`, and they are local. What is named `foo` in one repo may be named `bar` in a different one. And they are meant to be temporary: You don't want to clutter your repo with thousands of feature branches that need to be kept alive in order to avoid losing history. And you would need to keep those branches in order to keep the history they reference, for `git` will eventually delete any commit that's not reachable by a branch anymore. – cmaster - reinstate monica Dec 25 '16 at 21:48
  • @cmaster There are simple solutions for the clutter aspect, just do `git update-ref refs/hidden/branch branch; git branch -D branch` as suggested here: git.661346.n2.nabble.com/how-to-hide-some-branches-tad1594799.html – DepressedDaniel Dec 25 '16 at 22:32