How do I approach a complicated merge

Question

Here is the deal: I have joined a new company and have been asked to finish off the work on a branch which hasn't been touched for almost a year. In the meantime, the master branch has been growing at a steady pace. Ideally I would like to merge all of the changes from the master branch into the feature branch and continue the work from there, but I'm not too sure how to approach this.

How do I perform this merge safely while preserving important changes on both sides of the branch?

Thank you to everybody for awesome feedback. I'm going to give git-imerge a go and if that turns out to be messy I'll use a new branch approach! — Allan Spreys, Nov 11 '15 at 19:27
I prefer rebasing in this situation as it will go commit by commit. It will also allow you to squash the new feature before you make the merged code available. YMMV. — Stephen, Nov 19 '15 at 07:14

score 34 · Accepted Answer · 2015-11-11T12:10:25.750

At its heart, how to combine two (possibly non-compatible) pieces of code is a development problem, not a version control problem. The Git merge command may help in this process, but it depends on the shape of the problem.

Comparing both versions with the base first makes the most sense. This will give you an idea of the best strategy for taking this forward. Your approach might be different based on the nature and overlap of the changes in each branch.
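As a minimal sketch of that comparison, the script below builds a throwaway repository and then measures how far each branch has moved from the common ancestor. The branch name `old-feature` and the file names are made up for illustration; in a real repository only the last few commands are needed.

```shell
set -e
# Toy repository so the commands below can be run as-is.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > core.txt && git add . && git commit -q -m "initial"
git branch old-feature                 # the stale branch splits off here
echo "v2" > core.txt && git commit -q -am "master: change core"
echo "docs" > docs.txt && git add . && git commit -q -m "master: add docs"
git checkout -q old-feature
echo "feature" > feature.txt && git add . && git commit -q -m "feature: add feature"

# The common ancestor ("base") of the two branches.
base=$(git merge-base master old-feature)

# How many commits each side has added since the base.
master_commits=$(git rev-list --count "$base"..master)
feature_commits=$(git rev-list --count "$base"..old-feature)
echo "master: $master_commits commits, old-feature: $feature_commits commits"

# Which files each side changed relative to the base.
git diff --stat "$base" master
git diff --stat "$base" old-feature
```

A large commit count on master combined with a small, focused diff on the feature side suggests a different strategy than two large, overlapping diffs.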

Imagine the ideal scenario: you would discover that the main branch and the feature branch each only modified mutually exclusive parts of the code, so you could just commit all the changes in and be good to go.

Of course, that will almost certainly not be the case, but the question is: how far removed from this ideal scenario will it be? i.e. how intermingled are the changes?

Also, how mature was the old feature branch? Was it in a good working state, or not (or unknown)? How much of the feature was finished?

If the relevant code in the main branch has changed a lot in the past year, or the feature is not in a very mature state, I might consider creating a new fork of the latest and manually incorporating the old feature in again. This will allow you to take an incremental approach to getting it working.
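A rough sketch of that incremental approach, run against a throwaway repository. The names `old-feature`, `feature-v2` and `feature.txt` are hypothetical, and porting a whole file is only the simplest possible case; real work would usually mean re-applying changes by hand.

```shell
set -e
# Toy repository so the commands below can be run as-is.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > core.txt && git add . && git commit -q -m "initial"
git branch old-feature
echo "v2" > core.txt && git commit -q -am "master: change core"
git checkout -q old-feature
echo "feature" > feature.txt && git add . && git commit -q -m "feature: add feature file"

# Start a fresh branch from the latest master; old-feature stays untouched for reference.
git checkout -q -b feature-v2 master

# Review what the old branch did, oldest commit first.
base=$(git merge-base master old-feature)
git log --reverse --oneline "$base"..old-feature

# Bring over one piece at a time. Here: a file that only exists on the old branch.
git checkout old-feature -- feature.txt
git commit -q -m "Port feature file from old-feature"
# ...build, test, then port the next piece.
```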

If you do a messy merge of lots of code and it doesn't work, it will be quite hard to debug. If the main branch has changed a lot over the past year, major design changes may be needed to the feature to get it working. It would not be appropriate to make these changes via "resolve conflicts", since this would require making all the changes at once and hoping it works. This problem would be compounded by the possibility of bugs in the old partially finished branch.

+1 "I might consider creating a new fork of the latest and manually incorporating the old feature in again" — mika, Nov 18 '15 at 09:31
The first key question is: does the old feature branch build? Does it run? If not it's going to be very hard to test your merge. — Móż, Nov 19 '15 at 23:47

score 28 · Answer 2 · edited Nov 19 '15 at 23:12

In my limited Git experience, it is sometimes faster to start the feature branch over again if master has moved too far from the point where the branch split off.

Merging two branches without knowing the history behind the code (given that you've just joined the project) is really difficult, and I bet that even a developer who followed the project from the beginning would likely make some mistakes in the merge.

This of course makes sense if the feature branch is not huge, but you could simply keep the old feature branch open, branch again from master, and manually re-introduce the changes that make up the feature. I know it's the most manual approach, but it gives you complete control in case of missing or moved code.

Pair programming with a senior in this case would be the best case scenario, helping you to get to know the code better.
It could even turn out to be faster too, if you take into account merge conflicts and testing time!

I kind of assumed that at least trying to do a merge is obviously the best first step. If that fails or turns out to be too difficult, try cherry-picking; if that goes wrong, go the manual way.
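One way to "at least try" without committing to anything is a trial merge on a throwaway branch. The following is a sketch under assumed names (`old-feature`, `merge-trial`), using a toy repository with one deliberate conflict:

```shell
set -e
# Toy repository so the commands below can be run as-is.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > core.txt && git add . && git commit -q -m "initial"
git branch old-feature
echo "master version" > core.txt && git commit -q -am "master: change core"
git checkout -q old-feature
echo "feature version" > core.txt && git commit -q -am "feature: change core"

# Try the merge on a throwaway branch so old-feature itself is never touched.
git checkout -q -b merge-trial old-feature
if git merge --no-commit --no-ff master >/dev/null 2>&1; then
    conflicts=0
else
    # Files still marked as unmerged (U) are the ones with conflicts.
    conflicts=$(git diff --name-only --diff-filter=U | wc -l | tr -d ' ')
fi
echo "conflicted files: $conflicts"

# Back out of the trial merge; nothing is committed.
git merge --abort
```

The number of conflicted files is only a rough signal, but it helps decide between finishing the merge, cherry-picking, or the manual route.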

answered Nov 11 '15 at 08:36

GavinoGrifoni

Definitely not the correct approach. – Andy Nov 11 '15 at 08:37
@DavidPacker I know it's not the pure git approach, but I see it working in big projects where the merge could have changed the master **a lot** – GavinoGrifoni Nov 11 '15 at 08:38
If the merge alters the master in an unwanted way, you are not using git correctly. – Andy Nov 11 '15 at 08:40
no, this is the correct approach - if you have an ancient branch, and you do not know the code, trying to merge is not going to be very successful and it is going to be very risky. Determining the changes that were originally made, investigating whether they are even relevant to the new code, and then applying them to make sense is the way to do it. This is a manual approach, but in these circumstances, the only safe one to take. Of course, I'd still try a merge first, just to see what happens, and I'd check the log to see just how much change occurred on the branch too - could be trivial. – gbjbaanb Nov 11 '15 at 11:52
@gbjbaanb By doing what GavinoGrifoni suggests, you are throwing away potentially weeks of work. Just to do it over again. That's a lot of $ which the management does not want to spend. You are supposed to finish the work, finish it, because the foundation is probably solid, not to do it all over again, because doing it that way, you are most likely going to write similar code, which in return means lost value to the firm, because the code existed in the first place. – Andy Nov 11 '15 at 12:12
@DavidPacker: I do not think GavinoGrifoni suggests throwing all the work overboard. He suggests transferring the changes from the old branch manually to the current line of development, in a step-by-step manner. That will throw the old **history** away, not more. – Doc Brown Nov 11 '15 at 12:16
@dan1111 Copying it and pasting it in a new branch will result in the same changes, in the same files, as it would initially during the merge conflict. That's how git works. You are not saving anything, but merely moving merge conflicts to modified state. And when you have modified files, you are going to have to go through them all anyway, because you don't know what to replace and what should be left in place. Merge conflict is better, as it specifically tells you where there is a problem. Manually copying it from the branch has you browsing files trying to find the changes - waste of time. – Andy Nov 11 '15 at 12:17
@dan1111 If you don't do that, you are doing the work that has already been done. Throwing away costs, which management won't like. The user has been tasked with finishing the work, not doing it all over. – Andy Nov 11 '15 at 12:21
@DavidPacker: any suggestion how to use git in a way which supports "take a **small part** of the changes from the old branch, merge it with the master, compile and test it, fix bugs, then repeat again"? That is what is needed here, and I do not see how your suggested approach will support that. – Doc Brown Nov 11 '15 at 12:23
@DocBrown You can cherry pick changes from the obsolete branch to your new one. But it all leads to merge conflicts. You are not simply getting around them. Merge conflicts are not evil, git provides them to help you merge branches together faster. It's a great feature. Note that cherry picking takes longer than simply doing what I described in my answer. And leads to the same result. – Andy Nov 11 '15 at 12:25
@DavidPacker 1st: the branch is a year out of date, 2nd: the guy tasked with finishing it doesn't know the code at all. Given these 2 factors, a manual re-apply is the only realistic way to approach the task. Nobody is suggesting a simple copy-paste of the old branch's tip revision. – gbjbaanb Nov 11 '15 at 12:59
`You can cherry pick changes...` - a world of pain when things depend on other things, which is typical of nearly every system ever developed. – JᴀʏMᴇᴇ Nov 11 '15 at 13:48
@gbjbaanb Manual reapply is what the user will end up doing if he merges the branches together during the merge conflict resolution. It's comfortable, because git exactly shows him where the problems are. The user is not forced to search for them. I don't see how, in the end, your approach is different from mine, except only taking longer time to find what has been changed and needs to be revised. – Andy Nov 11 '15 at 14:05
@DavidPacker: Merge conflicts can become evil - if you have to resolve 500 of them at once before you get the program in a compilable and testable state again. That is the kind of situation the OP is expecting here. If you think it's possible to use git in an efficient manner to avoid this "all-or-nothing" situation, why don't you edit your answer and tell the OP how this can be accomplished? – Doc Brown Nov 12 '15 at 13:18
@DavidPacker, I wonder if you have ever actually tried what you are suggesting. I know I have, and in my experience a manual re-application of the changes has always been less painful. – Mike Chamberlain Nov 13 '15 at 08:21
@MikeChamberlain Of course I have. I would not recommend something I am not using myself and don't find good. Teams I am leading don't like the merge conflicts resolutions at first either (mostly because it messes up code highlighting and inspection), but we found out it leads to about 30 % less time needed to go through all the parts that are problematic and fix them (this is a sample from about 80-100 situations like this one), rather than doing it regularly, using the manual approach. – Andy Nov 13 '15 at 08:49
Cherry picking, one merge at a time means that you probably do more work in total, one bit at a time. The important part is that most cherry picks will be simple merges. That way you can focus wider assistance on the occasional complex cherry pick merges. – Michael Shaw Nov 20 '15 at 11:01

score 20 · Answer 3 · answered Nov 11 '15 at 13:23

git-imerge is designed exactly for this purpose. It is a git tool which provides a method for incremental merging. By merging incrementally, you only need to deal with the collisions between two versions, never more. Furthermore, a far larger number of merges can be performed automatically as the individual changesets are smaller.
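A hedged sketch of what a session might look like. The helper function `start_incremental_merge` and the default name `update-old-feature` are hypothetical; the `start`, `continue` and `finish` subcommands are taken from the git-imerge README, and the tool has to be installed separately.

```shell
# Hypothetical helper: start an incremental merge of master into the
# currently checked-out branch. Requires git-imerge to be installed.
start_incremental_merge() {
    name=${1:-update-old-feature}      # label for this imerge, chosen by you
    if ! command -v git-imerge >/dev/null 2>&1; then
        echo "git-imerge is not installed" >&2
        return 1
    fi
    git imerge start --name="$name" --first-parent master
}

# Typical session (run by hand, since conflicts need human decisions):
#   git checkout old-feature
#   start_incremental_merge
#   ...resolve the reported conflict and stage it: git add <files>
#   git imerge continue     # repeat until no conflicts remain
#   git imerge finish       # turn the finished imerge into the final history
```

Because each conflict comes from a single pair of commits, it is usually much easier to see which change on master collided with which change on the feature branch.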

score 7 · Answer 4 · 2015-11-16T19:21:10.157

Trying to merge the mainline head onto a year-stale branch can be an exercise in frustration and in deepening the dent on the desk with your forehead.

The mainline didn't get to where it is in one go over the course of months. It too had development and releases. Trying to bring it all up to date in one monolithic merge can be overwhelming.

Instead, start by merging in the first feature merge that landed on mainline after the stale branch split off. Get that merge working. Then take the next feature merge, and so on. Many of those feature merges will merge in without conflict. It is still important to make sure that the current functionality of the stale branch remains compatible with the direction that the mainline has gone.

You may wish to branch from the head of the stale branch for the role of merging in other changes. This is more about making sure that the commits and the history when someone looks back at it is clear and communicates what the role and policy of each branch is. The stale branch was a feature branch. The one you are working from is an accumulation and reconciliation branch.

Much of this will be easier if the old feature or release branches still exist out there and are easily accessible (some places have a policy of cleaning up the names of branches that are older than some date so that the list of branches isn't overwhelming).

The important thing in all of this is to make sure that you test and fix after successfully merging each part of the mainline history into place. Even though something may merge without conflicts, that just means the code didn't conflict. If the way that the stale feature was accessed was deprecated or removed, there may need to be fixes after the successful merge.
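The step-by-step procedure described above could be sketched as follows. This is an illustration on a toy repository, not a drop-in script: the branch names `old-feature` and `reconcile` are assumptions, and `test -f feature.txt` merely stands in for the project's real build and test run.

```shell
set -e
# Toy repository so the commands below can be run as-is.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > core.txt && git add . && git commit -q -m "initial"
git branch old-feature
for n in 1 2 3; do                     # three later changes on master
    echo "change $n" > "master$n.txt" && git add . && git commit -q -m "master: change $n"
done
git checkout -q old-feature
echo "feature" > feature.txt && git add . && git commit -q -m "feature: work"

# Reconciliation branch, so old-feature keeps its original history.
git checkout -q -b reconcile old-feature

# Walk master's own history since the split, oldest first.
base=$(git merge-base master old-feature)
merged=0
for commit in $(git rev-list --first-parent --reverse "$base"..master); do
    git merge -q --no-edit "$commit"   # stops here (set -e) on a conflict
    test -f feature.txt                # placeholder for the real build and test run
    merged=$((merged + 1))
done
echo "merged $merged steps from master"
```

With `--first-parent`, the loop follows only the commits made directly on master, so on a project that merges feature branches into master each step corresponds to one of those feature merges.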

As an aside, this works for other version control systems too. I've occasionally had need to merge a specific group of svn commits into a branch (cherry picking) for one feature, fix the branch to work with that feature, and then merge the next group of svn commits rather than just doing a wholesale svn merge.

While one can do a git cherry-pick here, and it does allow bringing in specific commits, this has some disadvantages that may complicate the process. The cherry pick will not show information about the commit you picked from (you can append it to the commit message). This makes actually tracking the commits in the history harder.

Furthermore, it means that you aren't going to be effectively replaying the master onto the stale branch - you are going to be picking possibly incomplete features - and those features may be played out of order.
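On the tracking point: `git cherry-pick -x` appends a reference to the original commit to the new commit message. A small sketch on a toy repository (branch and file names are made up):

```shell
set -e
# Toy repository so the commands below can be run as-is.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > core.txt && git add . && git commit -q -m "initial"
git branch old-feature
echo "fix" > fix.txt && git add . && git commit -q -m "master: important fix"
picked=$(git rev-parse master)
git checkout -q old-feature
echo "feature" > feature.txt && git add . && git commit -q -m "feature: work"

# -x appends "(cherry picked from commit <sha>)" to the new commit message,
# so the origin of the change stays visible in the history.
git cherry-pick -x "$picked" >/dev/null
git log -1 --format=%B
```

This records where a change came from, but it does not address the ordering and completeness concerns raised above.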

The key reason that one should merge from historical commits to master onto the stale branch is to be able to keep the, let's call it, "future history" of the stale branch in a state that you can reason about. You can clearly see the merges from history onto the stale branch and the fixes to reintegrate the functionality. The features are being added in the same order as they were to master. And when you are done, and finally do the merge from the head of master onto the stale branch, you know that everything has been merged and you aren't missing any commits.

+1 This is an interesting potential alternative solution that uses merge to incorporate large changes. However, I can see a downside: suppose you have major versions A B C D E and you want to incorporate feature branch A1 into E. There may be a lot of wasted effort merging the code into B C and D. For example, what if D to E was a big design change that rendered the incremental changes in B, C, and D irrelevant? Also, it again depends on how mature the feature was in the first place. A useful potential approach, but its appropriateness needs to be considered before starting. — , Nov 20 '15 at 07:05

score 1 · Answer 5 · answered Nov 18 '15 at 22:59

Step 1. Learn about the code, analyze its architecture, and the changes that have been made on both branches since the latest common ancestor.

Step 2. If the feature appears broadly independent and touches mainly different areas of code, merge, fix conflicts, test, fix etc. This is the happy path, and you're pretty much good to go. Otherwise go to Step 3.

Step 3. Analyze the areas of conflict, understand the functional impact and reasons in detail. There could easily be conflicts in business requirements that come to light here. Discuss with BAs, other devs as appropriate. Get a feel for the complexity involved with resolving the interference.

Step 4. In light of the above, decide whether to aim to merge/cherry-pick/even cut-paste only those parts that do not conflict and re-writing the conflicting pieces, OR whether to re-write the whole feature from scratch.
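For Steps 1 to 3, one quick signal is the set of files that both branches changed since their common ancestor. A minimal sketch, assuming a toy repository with made-up branch and file names:

```shell
set -e
# Toy repository so the commands below can be run as-is.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > core.txt && echo "v1" > util.txt && git add . && git commit -q -m "initial"
git branch old-feature
echo "m" > core.txt && echo "m" > util.txt && git commit -q -am "master: change core and util"
git checkout -q old-feature
echo "f" > core.txt && echo "f" > feature.txt && git add . && git commit -q -m "feature: change core, add feature"

base=$(git merge-base master old-feature)
tmp=$(mktemp -d)
git diff --name-only "$base" master      | sort > "$tmp/master.txt"
git diff --name-only "$base" old-feature | sort > "$tmp/feature.txt"

# Files changed on both sides: the likely areas of conflict (Step 3).
overlap=$(comm -12 "$tmp/master.txt" "$tmp/feature.txt")
echo "changed on both branches: $overlap"
```

An empty overlap points towards the happy path in Step 2. A long list does not prove the merge is hopeless, but it shows where the detailed analysis should start.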

Andy · Answer 6 · 2015-11-11T08:33:41.650

1. Switch to the branch which is used as a main developer/release branch.

This is the branch which contains the latest changes to the system. It can be master, core, or dev, depending on the company. In your case it is probably master directly.

git checkout master
git pull

Pull to make sure you have acquired the latest version of the main development branch.

2. Checkout and pull the branch which contains the work you are supposed to finish.

You pull to make sure you indeed have the latest contents of the branch. By checking it out directly, without creating it locally first, you ensure that it does not contain the new contents from master (or the main dev branch respectively).

git checkout <name of the obsolete branch>
git pull origin <name of the obsolete branch>

3. Merge the main development branch to the obsolete branch.

Before running the following command, make sure, by running `git branch` or `git status`, that you are on the obsolete branch.

git merge master

The git merge command will try to merge the contents from the specified branch, in this case master, to the branch you are currently at.

Emphasis on will try to. There might be merge conflicts, which will need to be resolved by you and you only.

4. Fix the merge conflicts, commit and push the conflict fix

After fixing the merge conflicts in every affected file, stage, commit and push the conflict resolution to origin.

git add .
git commit -m "fixed the merge conflict from the past year to update the branch"
git push

You can generally call git add . to stage all the files for commit. When dealing with merge conflicts, you want all the necessary files to be updated.
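Since `git add .` stages every file regardless of its contents, it can help to check for leftover conflict markers before committing. The sketch below uses a toy repository with one deliberate conflict; the branch and file names are hypothetical. `git diff --cached --check` exits non-zero when staged changes still contain conflict markers (or whitespace errors).

```shell
set -e
# Toy repository so the commands below can be run as-is.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo "v1" > core.txt && git add . && git commit -q -m "initial"
git branch old-feature
echo "master version" > core.txt && git commit -q -am "master: change core"
git checkout -q old-feature
echo "feature version" > core.txt && git commit -q -am "feature: change core"

git merge master >/dev/null 2>&1 || true   # expected to stop with a conflict

# Files that still need a decision.
unresolved=$(git diff --name-only --diff-filter=U)
echo "unresolved: $unresolved"

# Staging without editing would keep the conflict markers; --check catches that.
git add .
if git diff --cached --check >/dev/null; then markers=no; else markers=yes; fi
echo "markers left after blind add: $markers"

# Resolve for real (here: a hand-written combination), then stage and commit.
echo "master and feature version" > core.txt
git add core.txt
git diff --cached --check
git commit -q -m "Merge master into old-feature"
```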

Additional note

Resolving merge conflicts can be tedious work, especially if you are new at a company. You might not even have the proper knowledge to resolve all the merge conflicts alone yet.

Take your time to carefully inspect all the conflicts that have occurred and fix them appropriately, before continuing your work.

It can also happen that you start working on a one-year-old branch, merge the current development state into it, and have no merge conflicts at all.

This happens when even though the system has changed a lot in the year, nobody has touched the files which were actually altered in the one year old branch.

#4 is potentially the problem. If there were a lot of changes over the last year, major changes to the old feature might be required. Doing this by merge requires you to make major changes to the code all in one go and hope it works, which is not good development practice. Plus who knows what the state of the unfinished feature was? You may end up with a huge mess of non-working code, and who knows whether it was due to problems with the original or changes you made? — , Nov 11 '15 at 12:03
David, as long as this standard approach works, it is fine, and the OP should try this first. But there is definitely a risk of getting too many merge conflicts in the described situation to handle them in this "all-or-nothing" manner. — Doc Brown, Nov 11 '15 at 12:08
@dan1111 That there are merge conflicts is completely OK, and in fact going through them is the way to go. Since the branch was left untouched for a year, you can be pretty sure it is nothing that important and won't affect much of the system. So even though the branch is one year behind, you can get as little as 2 to none merge conflicts. — Andy, Nov 11 '15 at 12:15
The assumption that this branch is unimportant is unwarranted. It could have been a fundamental design change that was abandoned and is now being picked up again. It could be anything. You are right that it *could* be a simple matter and there *could* be few or no conflicts--in which case your answer would be correct. But that is not the only possibility. — , Nov 11 '15 at 12:18
@dan1111 If someone hasn't touched a feature placed in a separate branch for a year, it is not going to reflect on system changes that much. This comes from my own experience with obsolete (6+ months old) branches. — Andy, Nov 11 '15 at 12:20
Note: this answer provides useful information, and I would happily upvote it if you edited to make it clear that this is one potential approach and not appropriate in all cases. — , Nov 20 '15 at 07:07
