The git pull command seems convenient, but it actually does a
bit much at once. It fetches from remote, attempts to merge into
your local stuff, and, if successful, commits. If you’ve
committed your work locally prior to the
git pull, as
recommended by the
git documentation, the resulting commits are
not as simple and digestible as one would hope. This piece,
addressing novice and intermediate level
git users, discusses
the situation in some detail and recommends a combination of
git fetch and git rebase instead.
Let’s assume you are an industrious developer who has just
prepared a contribution to some project. You’re done, all unit
tests are happy, and now you want your work to become available
on a shared branch, say
master, so other people can see it.
Other members of your team may have changed
master. At this
point in time, you don’t yet know. To find out, you can, of
course, run git pull. If your teammates have not been working on
master, or have done only work that merges easily, you’re fine.
In the present scenario, we assume you have completed your changes locally, but have not committed them yet. Which means: You have no backup of your own work yet.
Git is afraid that it and you with combined forces might mess up
your workspace during conflict resolution. So, if there is a
conflict with uncommitted work, the “pre-merge checks” of the
git merge part of
git pull fail and nothing happens. No harm
has been done, but no progress has been made, either.
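The refusal described above can be reproduced in a throwaway sandbox. The following is only a sketch: the repository names (seed, origin.git, me, mate) and the file notes.txt are made up for illustration, and --no-rebase is passed explicitly so the classic merge behaviour of git pull is used regardless of local configuration.

```shell
set -eu
# Throwaway identity so commits work without any git configuration.
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical setup: a shared bare repository plus two clones.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
printf 'line 1\n' > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git mate

# The teammate changes the file and pushes.
printf 'line 1 (edited by mate)\n' > mate/notes.txt
git -C mate commit -q -am "mate edits line 1"
git -C mate push -q origin master

# I change the same line, but do NOT commit.
printf 'line 1 (edited by me)\n' > me/notes.txt

before=$(git -C me rev-parse HEAD)
if git -C me pull -q --no-rebase 2>pull.err; then
  pull_result=succeeded
else
  pull_result=refused
fi
after=$(git -C me rev-parse HEAD)
mine=$(cat me/notes.txt)

echo "pull $pull_result"
echo "my uncommitted edit is still: $mine"
```

In this sandbox the pull is refused, HEAD stays where it was, and the uncommitted edit remains in the working tree.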
Let’s take a step back and survey the situation. What’s up?
You certainly don’t want to risk a conflict resolution without a
backup. So you want a commit first. This is also the approach
recommended by the
git merge documentation: It discourages
merging on top of uncommitted changes.
So, the route to take is: Commit your local stuff first. After all, this is a version system. We should be using it to our advantage.
I see no reason not to heed that advice always, as a habit. You’re done with a piece of work? Commit locally first thing, before even looking at what the other people have done in the meantime. I do it that way, and, in my experience and judgment, this is a good habit to have.
So, with that in our mind, let’s start our little story one more time. Again, you’ve completed a job and the tests are happy.
This time, you commit your work first thing, on top of whatever
dated material your local
master happens to hold. You first
want to safely tuck your own stuff away. If you do that, you have
a commit, and that commit is there to stay. You can always come
back to it, no matter what. From this point on, your work is
safe. After that’s accomplished, you have a sound basis for
facing the merge work that’s ahead of you.
So, how do you go about that merge work? Maybe a
git pull now?
Yes, you can use
git pull now. But I argue that plain
pull is not what you want, once you’ve adopted the recommended “check
in first” habit.
To see why, let’s assume independent changes come in from other
members of your team. These are diligently merged by
git pull. In many cases, the merge is done fully automatically. All is
fine. – Or is it?
That merge which
git pull has produced for you is – well, a merge.
It’s a commit with two parents.
There is nothing inherently wrong with merge commits, commits with two parents (or, though rarely seen, even more than two). But those do add a cognitive burden. For yourself and your fellow workers, it just isn’t as easy to see what your commit is up to.
When you look at it via a git UI tool, that tool has two diffs it could show you. Depending on the tool, it might decide to show you none. Again: Unless you know how to specifically ask and use that knowledge, your git UI is likely to not show you anything about what happened in that merge commit.
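To make the two parents and the two possible diffs tangible, here is a small sandbox sketch. All names (seed, origin.git, me, mate, mine.txt, mate.txt) are invented for illustration. The pull is run with --no-rebase --no-edit: recent Git releases ask how to reconcile divergent branches when nothing is configured, and --no-edit keeps the merge message editor from opening.

```shell
set -eu
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical setup: a shared bare repository plus two clones.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
printf 'line 1\n' > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git mate

# Independent work on both sides, touching different files.
printf 'from mate\n' > mate/mate.txt
git -C mate add mate.txt
git -C mate commit -q -m "mate adds mate.txt"
git -C mate push -q origin master

printf 'from me\n' > me/mine.txt
git -C me add mine.txt
git -C me commit -q -m "I add mine.txt"

# Classic pull: fetch, merge, commit.
git -C me pull -q --no-rebase --no-edit

# Count the parent lines of the resulting commit.
parents=$(git -C me cat-file -p HEAD | grep -c '^parent ')
echo "HEAD has $parents parents"

# One diff per parent: each shows what the OTHER side contributed.
vs_first=$(git -C me diff --name-only HEAD^1 HEAD)
vs_second=$(git -C me diff --name-only HEAD^2 HEAD)
echo "against my own commit (HEAD^1): $vs_first"
echo "against the teammate's commit (HEAD^2): $vs_second"
```

The first parent is the commit I was on before the pull, the second is the tip that came in from origin. A tool has to pick one of these two comparisons, or show neither.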
In my experience, you do yourself and your team no favor if you use merge commits for trivial cases like this one.
Concrete example: This blog post was drafted the night after we cleaned up a botched merge, which needn’t have been a merge in the first place. Incidentally, should you have experienced problems with this innoQ homepage some time between Jan 25 in the evening through Jan 26, 2016, that botched merge commit most likely was the underlying cause. Initial repair attempts didn’t cut through to the root of the problem. In the end, some five innoQ developers teamed up to fully rectify the situation. Our repair work would have started earlier and progressed faster, had the botched two-parent merge commit been a plain normal single-parent commit instead.
Assuming you happen to be the kind of person wanting to learn from other people’s troubles: Use those merge commits only for more serious branch work.
So, if not
git pull, what else? Let me come back to our
original scenario one last time. Work is done, tests are happy,
now: What would I do? What do I actually do, in such a
situation?
I initially add a commit with my work to whatever dated version
of master I’ve been working on. This is a version system, and I
want to take advantage of that and make sure my work is safe. So
far, so obvious (by now).
Next thing, I might simply try
git push. If I’m lucky, nobody
else has touched
master and I’m finished. If I’m not, no harm
is done, either.
Nice try, but most of the time, that push doesn’t work. In that
case, I’d now run
git fetch.
That done, I now have my own commit on my personal
master, as well as my team-mates’ results in
origin/master.
Now, I want to do merge work, but without actually producing a
merge commit.
git rebase origin/master does that trick for me.
What does that do? It grafts a copy of my commit on top of the
new stuff on
origin/master. That new commit copy has only one
single parent, namely, the previous latest commit of
origin/master. Locally, this commit also becomes the new
HEAD of my local
master which I’m on, with all
the work from the other team members integrated in its history.
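The fetch-then-rebase sequence can be tried out in a sandbox like the one below. This is a sketch under assumptions: the repository and file names are hypothetical, and both sides touch different files so that the rebase goes through without conflicts.

```shell
set -eu
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical setup: a shared bare repository plus two clones.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
printf 'line 1\n' > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git mate

# The teammate pushes first.
printf 'from mate\n' > mate/mate.txt
git -C mate add mate.txt
git -C mate commit -q -m "mate adds mate.txt"
git -C mate push -q origin master

# I commit on top of my dated local master.
printf 'from me\n' > me/mine.txt
git -C me add mine.txt
git -C me commit -q -m "I add mine.txt"
old=$(git -C me rev-parse HEAD)

# Fetch, then graft a copy of my commit onto origin/master.
git -C me fetch -q
git -C me rebase -q origin/master

new=$(git -C me rev-parse HEAD)
parents=$(git -C me cat-file -p HEAD | grep -c '^parent ')
parent=$(git -C me rev-parse HEAD^)
upstream_tip=$(git -C me rev-parse origin/master)
echo "commit copied: $old -> $new, parents: $parents"

# The push is now a plain fast-forward on the remote side.
git -C me push -q origin master
```

The new commit has a different SHA than the original one, a single parent, and that parent is the tip of origin/master.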
Admittedly, this emits a certain odor, as
git rebase entails
some amount of “rewriting of history”. The new commit pretends
I’ve started my work on the basis of that previous latest commit
of origin/master, while in fact I started from an earlier commit.
But this is only a minor amount of “rewriting of history”. As presented here, I only manipulate my own local commits, which I have not yet shared with anyone. Such limited “rewriting of history” I consider quite tolerable.
In the trivial situation, when the merge work can be done
automatically,
git rebase will leave me with a version of
master that I can test one more time and then
push.
Should I face a merge conflict, I’ll have to resolve and commit
manually. If I manage to do that, fine, final test and
push again and all is well.
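For the conflict case, a sandbox sketch might look as follows. The names and the chosen resolution (keeping both edits) are assumptions for illustration. During a rebase, the resolved file is staged with git add and the commit is then recorded by git rebase --continue; GIT_EDITOR=true merely accepts the existing commit message without opening an editor.

```shell
set -eu
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical setup: a shared bare repository plus two clones.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
printf 'line 1\n' > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git mate

# Both sides edit the same line.
printf 'line 1 (edited by mate)\n' > mate/notes.txt
git -C mate commit -q -am "mate edits line 1"
git -C mate push -q origin master

printf 'line 1 (edited by me)\n' > me/notes.txt
git -C me commit -q -am "I edit line 1"

git -C me fetch -q
if git -C me rebase -q origin/master >rebase.log 2>&1; then
  rebase_result=clean
else
  rebase_result=conflict
fi
echo "rebase result: $rebase_result"

# Resolve by hand. Here: keep both edits (an arbitrary choice).
printf 'line 1 (edited by mate)\nline 1 (edited by me)\n' > me/notes.txt
git -C me add notes.txt
GIT_EDITOR=true git -C me rebase --continue >/dev/null 2>&1

parents=$(git -C me cat-file -p HEAD | grep -c '^parent ')
parent=$(git -C me rev-parse HEAD^)
upstream_tip=$(git -C me rev-parse origin/master)

# After a final test run, the push goes through.
git -C me push -q origin master
```

Even after manual conflict resolution, the result is a single-parent commit sitting directly on top of origin/master.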
Should I get the merge work wrong on first try, I can back out
and try again. My original commit is still patiently waiting to
see whether it’s still needed. I just have to dig up its SHA.
To do so, I’ll just scroll up my terminal window, or else use
gitk’s “view all refs”.
For my second attempt at the merge work, I want to reestablish my
master to point to my commit. A straightforward
way is to delete it and create it anew.
Admittedly, that’s more robust than elegant. As a consequence of
the deletion,
git no longer knows where to push the new
master, so I shall need the explicit
git push -u origin master.
Fortunately, this problem is self-healing: The connection to
origin is reestablished by this explicit push.
But elegant or not, it works. I can retry my
origin/master merge work as often as I require to get it right.
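One possible way to spell out the back-out-and-retry cycle is sketched below. The exact commands are my assumption of how "delete and create anew" could be done, not necessarily the ones used above, and all repository and file names are hypothetical. The original SHA is kept in a shell variable here; git reflog would list it as well.

```shell
set -eu
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical setup: a shared bare repository plus two clones.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
printf 'line 1\n' > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git mate

printf 'from mate\n' > mate/mate.txt
git -C mate add mate.txt
git -C mate commit -q -m "mate adds mate.txt"
git -C mate push -q origin master

printf 'from me\n' > me/mine.txt
git -C me add mine.txt
git -C me commit -q -m "I add mine.txt"
orig=$(git -C me rev-parse HEAD)   # the SHA to come back to

# First attempt. Let us pretend I am unhappy with the result.
git -C me fetch -q
git -C me rebase -q origin/master

# Back out: park on the original commit, delete master, recreate it.
git -C me checkout -q --detach "$orig"
git -C me branch -q -D master
git -C me checkout -q -b master
restored=$(git -C me rev-parse HEAD)

# The recreated branch has lost its connection to origin.
if git -C me rev-parse --abbrev-ref 'master@{upstream}' >/dev/null 2>&1; then
  upstream_before=configured
else
  upstream_before=none
fi

# Second attempt, followed by the explicit push.
git -C me rebase -q origin/master
git -C me push -q -u origin master >/dev/null 2>&1
upstream_after=$(git -C me rev-parse --abbrev-ref 'master@{upstream}')
echo "upstream before: $upstream_before, after: $upstream_after"
```

In this sandbox, the recreated master has no upstream until git push -u origin master sets it again.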
After the push, my colleagues will get to see a sole nice commit with a single parent, easily comprehensible. No uncalled-for cognitive burden here.
In conclusion, I want to emphasize that
git is a version system
which I, the developer, feel free to use locally as I please. I
can produce as many commits on my local
master or any other
local branches as is convenient for me. I might use frowned-upon
zero-information check-in comments such as “work in progress”.
If I so desire, I might even do a commit just to keep (via the
commit timestamp) a record of my departure time for lunch.
None of these commits need (or should) ever become visible via
origin. When a particular piece of work is complete, I squash
all related commits into one, using
git rebase -i origin/master
(either before or after
git fetch, as I please). During that
process, I also come up with an informative check-in comment for
the whole thing. The end result is another sole shiny
well-commented single-parent commit.
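The squashing step is interactive by nature, but it can be scripted for demonstration purposes. In the sketch below, GIT_SEQUENCE_EDITOR is set to a sed call that turns every "pick" after the first into "squash", which is what I would otherwise type into the todo list by hand. The final message is set with git commit --amend afterwards, as a stand-in for writing it in the editor during the rebase. Repository and file names are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical setup: a shared bare repository plus one clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
printf 'line 1\n' > seed/notes.txt
git -C seed add notes.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed origin.git
git clone -q origin.git me

# Three sloppy local commits.
for step in one two three; do
  printf '%s\n' "$step" >> me/greeting.txt
  git -C me add greeting.txt
  git -C me commit -q -m "work in progress"
done
before=$(git -C me rev-list --count origin/master..HEAD)

# Scripted "rebase -i": keep the first pick, squash the rest.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git -C me rebase -q -i origin/master >/dev/null 2>&1

# Replace the concatenated messages with one informative comment.
git -C me commit -q --amend -m "Add greeting file"

after=$(git -C me rev-list --count origin/master..HEAD)
subject=$(git -C me log -1 --format=%s)
lines=$(wc -l < me/greeting.txt | tr -d ' ')
echo "commits ahead of origin/master: $before -> $after ($subject)"
```

The three "work in progress" commits end up as one commit on top of origin/master, with the complete file content intact.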
So, now you have my reasons for my reservations regarding
pull. In contrast, I consistently find
git rebase to be my
Those of you not wanting to resist the convenience lure of
pull, consider taming it with one of the
Basic command line
git diff SHA1 SHA2 works well in this situation, and I highly recommend you have it in your toolbox, including the
When accepting a pull request, I like to document my review work by forcing a two-parent commit, even where a fast-forward would be possible. ↩
In case you care, my actual habit is to use
git fetch -p. But the
-p is irrelevant in the present context. ↩
If either you or your project disagrees, you’ll probably have to live with double-parent merge commits and suffer the consequences. You may be able to reduce the number of such commits by taking the “feature branch” route. This helps somewhat, as long as only one person actively commits into each feature branch. ↩
No, even I haven’t actually done that. But you see my point. ↩