183

I'm learning git and I've noticed that it has a two-step commit process:

  1. git add <files>
  2. git commit

The first step places revisions into what's called a "staging area" or "index".

What I'm interested in is why this design decision was made, and what its benefits are.

Also, as a git user do you do this or just use git commit -a?

I ask this as I come from bzr (Bazaar) which does not have this feature.

thomasrutter
  • 3
    +1 for asking. I use Tortoise SVN, which has the same approach and I never understood why. – DPD Apr 18 '11 at 10:34
  • 3
    The staging area isn't that unusual. The equivalent in, say, TFS would be checking or unchecking the box next to a file before checking in. Only the checked files get committed. The difference with Git is that if you use `git add -p`, you can choose to commit one piece of a file while not committing another piece of the *same file*. – Kyralessa Sep 30 '11 at 03:57
  • I found that this [link](http://gitolite.com/uses-of-index.html) summarizes most of what was answered here and adds a few more use cases to justify the need for staging. – Veverke May 04 '15 at 17:15
  • 2
    This question is actually already answered, but here is also one good explanation: http://stackoverflow.com/questions/4878358/why-would-i-want-stage-before-committing-in-git – radekEm Mar 07 '16 at 20:47
  • 1
    Don't forget `git status` and possibly `git push`. For all the hype about git, (and GitHub sharing code is wonderful) parts are very annoying – user949300 Mar 17 '17 at 18:22

4 Answers

83

Split work into separate commits. You have probably opened a file many times to write a single-line fix, only to spot that the formatting was wrong, that some documentation could be improved, or that some other unrelated fix was needed. With other RCSs you'd have to write that down or commit it to memory, finish the fix you came for, commit that, and then return to fix the other stuff (or create a ball-of-mud commit with unrelated stuff). With Git you just fix all of it at once, and stage+commit the single line separately, with git add -i or git-gui.

Don't break the build. You're working on a complicated modification. So you try different things, some of which work better than others, some of which break things. With Git you'd stage things when the modification made things better, and check out (or tweak some more) when the modification didn't work. You won't have to rely on the editor's undo functionality, you can check out the entire repo instead of going file by file, and file-level mistakes (such as removing a file that has not been committed or saving+closing after a bad modification) do not lead to lots of lost work.
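
As a rough illustration of using the index as a checkpoint, assuming a scratch repository with one hypothetical file: git checkout -- <file> restores the working copy from the index, so whatever was last staged is what comes back.

```shell
# Throwaway repository; file name and contents are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "version 1" > app.txt
git add app.txt && git commit -q -m "Initial version"

# An attempt that improves things: checkpoint it in the index.
echo "version 2 (works better)" > app.txt
git add app.txt

# A further attempt that breaks things.
echo "version 3 (breaks things)" > app.txt
git diff                    # shows only the change made since the checkpoint
git checkout -- app.txt     # discard it and restore the staged copy

# Accidentally deleting the file is recoverable the same way.
rm app.txt
git checkout -- app.txt

cat app.txt                 # prints the staged "version 2 (works better)" line
```

Nothing has been committed since the initial version, but the staged state survives both the bad edit and the deletion.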

l0b0
  • 3
    Coming from a DVCS (bzr) that does not have this feature, that sounds a lot like what I currently achieve with a combination of liberal use of my editor's "undo" buffers, the "revert " command and selective commits ("commit "). Sounds like this feature of git has the potential to be more methodical. – thomasrutter Apr 19 '11 at 06:21
  • 1
    Regarding "other RCSes", that is not necessarily true. In fact, you can achieve those same functionalities in Mercurial using [patches](http://mercurial.selenic.com/wiki/MqExtension). – Lucio Paiva Oct 12 '14 at 17:04
  • 1
    @l0b0, about your second point. If there was just a single stage commit, you could have just committed the changes (that you use with git add) directly as a commit. If you found out that you did something wrong, you would have just deleted the commit, and got back to where you were before you made the commit. With the staging concept, aren't you just doing that, but adding more complexity? – alpha_989 Apr 10 '18 at 17:15
  • Your first point makes sense, though I haven't used it, so far. Theoretically why can't you do something like a `git add -i` with a single stage commit? You would just pick a bunch of files (or lines within files) related to a single feature, and do a commit. Then you would come back and do a second commit related to another feature.. – alpha_989 Apr 10 '18 at 17:18
  • @thomasrutter, From your statement, it seems you are suggesting that the staging area creates "manual undo points". In VIM with persistent-undo, you can get unlimited history very reliably. This is also tracked automatically in a `git-branch` type fashion (https://jovicailic.org/2017/04/vim-persistent-undo/). Further, your undo history is automatically tracked every time you go into normal mode. So it reduces your mental burden of having to create "manual undo points". Why is using your editor's "undo" buffers not as methodical? – alpha_989 Apr 10 '18 at 17:24
  • 3
    -1 because the staging area doesn't actually have anything to do with the first point here. @alpha_989's question is bang on: allowing the user to commit only some of their working directory's changes at once doesn't require two-stage commit, as evidenced by Mercurial, which has one-stage commit but still offers `hg commit --interactive`. – Mark Amery Oct 16 '18 at 22:45
  • 1
    Both of these use cases are handled in Mercurial without the staging concept. Have read about ten attempts to explain the point of staging so far and all fail because the authors seem ignorant of the capabilities of other VCSs, as in this case. – cja Feb 01 '20 at 21:09
65

One of the benefits for me is the ability to "add" files progressively. Before committing I review each file. Once the file is reviewed, I add it. When I run git diff, git shows me only the changes that have not been added yet, and git status lists those files separately from the staged ones. When I have reviewed all the files and added them, then I can commit.

So yes, I find the staging area very helpful.

And no, I never use git commit -a. However, I often use git add -u. This way I can still visualize what's to be committed.
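
Here is a small sketch of that review loop, assuming a scratch repository with made-up file names. It relies on git diff comparing the working tree against the index, git diff --cached comparing the index against the last commit, and git add -u staging changes to tracked files only.

```shell
# Throwaway repository; file names are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo a > a.txt
echo b > b.txt
git add a.txt b.txt && git commit -q -m "Initial files"

echo "edit" >> a.txt
echo "edit" >> b.txt
echo "new" > c.txt                 # untracked file

git add a.txt                      # a.txt has been reviewed
git diff --name-only               # still to review: b.txt
git diff --cached --name-only      # ready to commit: a.txt

git add -u                         # stage the remaining tracked changes
git status --short                 # a.txt and b.txt staged, c.txt still untracked
git commit -q -m "Reviewed edits"
```

Unlike git commit -a, the git add -u step leaves a moment to look at git status or git diff --cached before the commit is made.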

David
  • 2
    Exactly. The advantage is much more fine-grained control over exactly what you are committing. – Josh K Apr 18 '11 at 16:27
  • what happens when you stage one file multiple times? does it get "merged" in the staging area? – m4l490n Jul 26 '17 at 21:59
23

The benefit is quite simple: it gives you full control over which files you want to commit when. For that matter, you can use git add -p to control which lines you want to commit.
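
A brief sketch of the file-level side of this, assuming a scratch repository with hypothetical file names. It also shows taking a file back out of the index with git reset -- <file> before committing.

```shell
# Throwaway repository; file names and messages are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo one > parser.txt
echo one > docs.txt
git add parser.txt docs.txt && git commit -q -m "Initial files"

echo two >> parser.txt
echo two >> docs.txt

git add parser.txt docs.txt
git reset -q -- docs.txt        # changed our mind: unstage docs.txt, keep the edit
git commit -q -m "Update parser"

git add docs.txt                # the docs edit goes into its own commit later
git commit -q -m "Update docs"
git log --oneline --stat
```

The working tree is never touched by the staging and unstaging steps; only the contents of the next commit change.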

Rein Henrichs
  • 2
    I've always wondered about how to do this. I wish there was a file `.gitignorelines` so you could make local changes to individual lines that could survive commits, and remain intact. – alex gray Sep 29 '13 at 15:48
  • Ignoring local per-line changes and ensuring that your code differs from everyone else's in ways only you know about at the line level? That sounds like an anti-feature to me. – Rein Henrichs Sep 29 '13 at 22:43
  • 3
    @ReinHenrichs, think about config files that need changing by each developer. – Ian Apr 28 '14 at 10:56
  • 1
    @Ian So part of the file changes infrequently and is shared and part of the file changes often, in incompatible ways, and isn't shared? Supporting this false connascence definitely sounds like an anti-feature. – Rein Henrichs Apr 29 '14 at 16:58
  • 1
    @ReinHenrichs, yes, and it is very common when the file contains the name of the database server and each dev has their own database. – Ian Apr 29 '14 at 18:43
  • 4
    @Ian Your real problem is that a file which is supposed to be the configuration for the application also contains some machine/dev specific configuration. All configuration systems I know of allow you to split that into multiple files. So for example you have your `app.conf` which contains the stuff you want shared, and then a `db.conf` which you just put on the .gitignore list. Problem solved. If you're using something proprietary you should really look into getting something so simple in there. Or put it through a preprocessor in a pre-build event. Many solutions there. – Aidiakapi Feb 28 '15 at 01:30
  • @ReinHenrichs, Even if the 2 stage model was a 1 stage model, could you just have committed the specific files into a single commit, in 1 shot? Also, why is the 2 stage model necessary to be able to do `git add -p`? I don't necessarily see a relationship between being able to add specific lines to git commit, and the 2 stage model. Could you point me to some references, or any tutorial on `git add -p`? – alpha_989 Apr 10 '18 at 17:29
1

One of the benefits that I like is the ability to commit a portion of a change, e.g., by using git add -e. I do not commit as often as I should sometimes, and the git add -e command lets me unravel my changes to an extent.
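
To give a feel for git add -e, here is a sketch that assumes a scratch repository and stands in for the human with a tiny scripted "editor". Normally git opens the diff in your real editor and you delete the "+" lines you do not want staged; the script below does the same thing by dropping one added line. The file name, contents, and the editor script are all made up for illustration.

```shell
# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "base" > notes.txt
git add notes.txt && git commit -q -m "Initial notes"

# Two additions: one worth committing, one leftover debugging line.
echo "real change" >> notes.txt
echo "debug print" >> notes.txt

# Stand-in for an interactive editor: removes the added debug line from the patch.
editor=$(mktemp)
cat > "$editor" <<'EOF'
grep -v '^+debug' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF

GIT_EDITOR="sh $editor" git add -e notes.txt

git diff --cached    # staged: only the "real change" line
git diff             # unstaged: the "debug print" line remains in the working tree
```

The edited patch is applied to the index only, so the working file keeps both lines while the next commit will contain just the first.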

leed25d