
I'm learning Git from here and don't see how resolving conflicts can be anything other than changing your files to match that of the files you're merging from. This is because you can't change the files that you're merging from so you'd have to change your own files, correct? So for example, if there's an additional line of code in the file you're merging from, you'd have to add that line of code to your local file too in order for it to match the original file and then put your non-conflicting code over it.

akantoword
  • You have it right. So what's the question? The merge or rebase will make those changes to your code, so your code will then have the changes from the branch you are merging from. – David Neiss May 25 '16 at 22:21
  • @DavidN I wanted to confirm I was right but I'm confused by this example - https://help.github.com/articles/resolving-a-merge-conflict-from-the-command-line/ In their example of the conflict resolution, they changed the line to something entirely different than what was on the remote repo and the local repo. So does that mean it will change on both current working directory and the local commit history? What will happen when he pushes this code back to the remote? wouldn't there be another of the same conflict since that new line is different than the original? – akantoword May 25 '16 at 22:23
  • So when a merge results in changes to a file that don't overlap with changes that you made, the merge will be clean. If, however, the branch you are merging has changes that overlap with the ones you made, there is a "merge conflict" that has to be resolved. All this means is that the merge is suspended while you update the code to clean up the merge conflict, and then it continues. Make sense? – David Neiss May 25 '16 at 22:26
  • Ok I see that merge conflicts only happen when I'm merging onto my local repo from a remote. But what happens when there are conflicts that I'm pushing? That's not called a merge, but wouldn't there usually be conflicts that need to be resolved as well on that end? If the merge conflict I described in my previous comment (me pushing a file that has a line that is in conflict with the same line that is already on the remote repo that someone else had just pushed to previously) happened during a push, how does Git know what to do? – akantoword May 25 '16 at 22:35
  • Excellent question. That could indeed be a problem, and for that reason it's usually required that you merge and resolve any merge conflicts on your side and then push, so that, in this special case, the remote can do a fast-forward merge. – David Neiss May 25 '16 at 23:51
  • The remote is usually configured to reject a push that cannot be fast-forward merged, and that's when you get an error back when you try to push. – David Neiss May 26 '16 at 00:13

3 Answers


Push does not merge

This is very long, so I have broken it up a bit. If you like, jump to the bottom, where "push does not merge" is repeated.

You are proceeding from several faulty assumptions (not your fault, really, as Git documentation is ... let's just say "not so great" :-) ).

  1. To understand merge, you need to understand the concept of a merge base. The general idea is to combine two different sets of changes. It also helps to know how Git stores commits, in a Directed Acyclic Graph or DAG.

  2. Regarding this (from comments):

    Ok I see that merge conflicts only happen when I'm merging onto my local repo from a remote.

    This is not true either, although that is a typical case. Let's take a closer look in just a moment.

  3. This part:

    But what happens when there are conflicts that I'm pushing?

    is perhaps the worst-documented thing of all, in Git.

Objects

Before we can get started, the first thing to know is that Git stores everything in an object database. This database forms the main bulk of the repository (there are a lot of ancillary items, like hooks and reflogs and so on, that are also part of a repository, but it's the objects that are the real heart, and these are almost the only things that get copied when you clone a repository). Each object has an identifier (an ID, a "true name" if you will) that is one of those big ugly 40-character SHA-1 things you have no doubt seen, like bbc61680168542cf6fd3ae637bde395c73b76f0f.

The ID of a Git object uniquely identifies that object. It is the only object in the repository with that ID. There are only four kinds of objects (commits, trees, "blobs", and annotated tags). If Git did not have to be useful, and could just do graph theory things, it would only really need one, namely the commit; and a lot of Git really just deals with that one object type.
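To make the "true name" idea a little more concrete, here is a minimal sketch (in Python, not Git's own code) of where a blob's ID comes from: Git hashes a short header, the word blob, the content size, and a NUL byte, followed by the file content. The function name blob_id is made up for this illustration, and the sketch assumes a repository that uses SHA-1 object names.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a blob with this content."""
    # Header is "blob <size>" followed by a NUL byte, then the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same bytes always hash to the same ID, which is why the ID
# can serve as the object's name.
print(blob_id(b""))
print(blob_id(b"hello\n"))
```

Because the ID is derived from the content, two files with identical bytes share one blob, and changing a single byte yields a completely different ID.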

The commit graph—the DAG—is formed by looking at all the commits simultaneously. When doing this part, we get to ignore all the other objects in the repository.

The commit DAG

Let's have a look at this DAG thing. This is shown (with more cartoons than necessary, which is fine, and unfortunately less math than necessary, which is not) on the page you linked, but it needs a bit more.

In Git, each commit stores several items:

  • A name (and email and timestamp): A U Thor <author@example.com> made this commit (on whatever date). (In fact, Git stores two of these, one for the author and one for the committer, though most of the time you can ignore this.)
  • A source tree snapshot.
  • A parent commit ID. Actually, this is a list of zero-or-more IDs, but usually there is just one. If there are zero parents, this is a root commit. If there are two (or more but let's not go there), this is a merge commit.

It's these parent IDs that give rise to the graph: the tip commit on a branch points to its parent, which points back to another parent, which points back yet again. This gives us those lovely:

... <- o <- o <- o

diagrams, which on your linked page are mostly drawn more like:

...--o--o--o

I'll draw them like either of these, and sometimes, to mark a particularly interesting commit, write it as *. Towards the end, I will use some single uppercase letters, so that I can talk about particular commits in the paragraphs below the graph.

It's important to note that each of these connections is actually an arrow: it's easy to go from any commit node to all of its parents, but it is hard to go from a parent to any of its children. This matters a lot later, if you get deep into Git, but right now it matters a little because of something else deeply important in Git:

  • Commits (in fact, all objects) are immutable.

Since a commit is immutable, once a commit exists, it always points to the same fixed parent(s). You can add new child commits later because they point back to their parent, so you can add to the repository, but it is literally impossible to remove anything, ever. This is of course a problem, so there is a way to pretend-remove items, and pretend-removed stuff eventually gets garbage collected and really removed. How quickly it gets "really removed" is a separate issue (which I will skip over here as this is very long already).
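The parent-pointer idea can be sketched in a few lines of Python. This is only a toy model, not how Git stores anything: the Commit class, the history function, and the IDs c1, c2, c3 are all invented for the illustration. It shows that walking from a tip back to its ancestors is easy, and that a frozen record cannot have its parents changed after the fact.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned, like an immutable commit
class Commit:
    id: str
    parents: Tuple[str, ...]  # zero parents = root commit, two or more = merge commit


def history(commits: Dict[str, Commit], tip: str) -> List[str]:
    """Walk backwards from tip through parent links, visiting each commit once."""
    seen, order, stack = set(), [], [tip]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        order.append(cid)
        stack.extend(commits[cid].parents)
    return order


# ... <- c1 <- c2 <- c3
commits = {
    "c1": Commit("c1", ()),
    "c2": Commit("c2", ("c1",)),
    "c3": Commit("c3", ("c2",)),
}
print(history(commits, "c3"))
```

Note that nothing in a Commit records its children, which is why going "forwards" from a parent requires searching every commit.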

As we just noted, a merge commit is just a commit that points to at least two parents. What about a "branch point" commit? That is, what happens if (in whatever way; we'll get to how in a bit) we start with a straight line of commits all pointing to parents:

... <- o <- o <- *

and then add to this, two commits that both point to the current tip-most * commit? That is, somehow we add two more commits but not in a straight line:

             o
            /
...--o--o--*
            \
             o

Now there are two different branch-tips, and we can keep adding commits to both branches.

But ... how did we manage this? How did we get * to be at the base of a divergence of two different branches? The answer lies in Git's references. Git's branch names are the most common and familiar form of reference, so let's touch on these.

References, specifically branch names

In Git, a branch name is just a human-readable name, of your own choosing, that contains the ID of a commit. That's it: one commit, a single solitary commit, one count it one commit.

This commit—the commit whose ID is stored in the branch name—is the tip of the branch. In fact, that's how we find the actual branches: by starting at these tips. The branches are the data structures inside the DAG, and the branch names—which we (and Git) also call "the branches"—are just the names that let us locate the branch tips, which let us find the actual branches.

Let's draw that graph again and put branch names on:

             o   <-- branchA
            /
...--o--o--*
            \
             o   <-- branchB

Here branchA points to one tip commit, and branchB points to the other.

There's one more special property of a branch name, though. Git allows you to be "on a branch", as git status puts it, by which it means "on one specific branch". Git does this by writing the branch name into a file inside the repository. This file is named HEAD and HEAD has the name of a branch, so HEAD effectively points to the branch name. So to be really accurate, we should draw it like this:

             o   <-- HEAD -> branchA
            /
...--o--o--*
            \
             o   <-- branchB

Now we know that you have two branches, branchA and branchB, and you're currently on branchA. This is also how new commits work: when you make a new commit, Git reads the branch name from HEAD, makes the new commit with its parent set to the current branch tip, then writes the new commit's ID into the branch name. Voila, the branch is now one commit longer—the branch has grown, with a new commit on its right—and the new commit's parent is the previous branch-tip.
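Here is a rough Python model of that bookkeeping, under the assumption that a branch is nothing more than a name-to-ID entry and HEAD is nothing more than a branch name. The names make_commit, star, a1 and b1 are invented for the sketch. It also shows how the two-tips picture arises: two commits made while HEAD names different branches end up sharing a parent.

```python
from typing import Dict, Tuple

commits: Dict[str, Tuple[str, ...]] = {"star": ()}  # commit id -> parent ids
branches: Dict[str, str] = {"branchA": "star", "branchB": "star"}  # name -> tip id
head = "branchA"  # HEAD holds a branch name, not a commit id


def make_commit(new_id: str) -> None:
    """Add a commit whose parent is the current tip, then move the branch name."""
    tip = branches[head]
    commits[new_id] = (tip,)
    branches[head] = new_id


make_commit("a1")   # made while on branchA
head = "branchB"    # roughly what switching branches does to HEAD
make_commit("b1")   # made while on branchB

print(branches)     # each name now points at its own new tip
print(commits)      # both new commits have "star" as their parent
```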

This is how the pretend-remove stuff works too. Git says that only commits found by names (references) are immediately visible. But each commit points back to its parent(s), so a visible commit makes its parents visible, and those make their parents visible, and so on.

Every commit that is reachable, by starting from one of these branch names or other references and looking at visible parents, is "still in there". Commits that are not reachable are pretend-removed, and become eligible for garbage collection. (Git also has "reflogs", which occupy a sort of half-way zone between visible and invisible: ghostly references that keep commits reachable, but which git log normally doesn't look at, even with --all. They keep commits from being garbage collected immediately. These reflogs eventually expire and then the commits can go away.)
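The reachability rule can be sketched as a small graph walk. This is a simplified model that ignores reflogs and every kind of reference other than branch names; the function name reachable and the IDs are hypothetical.

```python
from typing import Dict, Set, Tuple


def reachable(commits: Dict[str, Tuple[str, ...]], refs: Dict[str, str]) -> Set[str]:
    """Return every commit found by starting at a ref and following parents."""
    seen: Set[str] = set()
    stack = list(refs.values())
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(commits[cid])
    return seen


commits = {"c1": (), "c2": ("c1",), "c3": ("c2",)}
refs = {"master": "c3"}
print(sorted(reachable(commits, refs)))   # all three commits are visible

refs["master"] = "c2"                     # move the name back one commit
unreachable = set(commits) - reachable(commits, refs)
print(sorted(unreachable))                # c3 is now "pretend-removed"
```

Nothing was deleted from the commits table in this model; c3 simply stopped being findable from any name, which is the state that makes a commit eligible for garbage collection.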

Remote-tracking branches

Besides regular branches, Git provides something called a "remote-tracking branch". This is almost the same as a regular branch, except for two things:

  • You can't get on it: if you try, you get into what Git calls "detached HEAD" mode.
  • You don't normally update it, or do anything at all with it. You let your Git update it, which your Git does when your Git talks to someone else's Git.

This second item reveals the real point of a remote-tracking branch: it keeps track of some branch on some remote (hence the name remote-tracking branch).

Every now and then, you ask your Git to call up someone else's Git over the Internet-phone and send a lot of texts or snapchats or whatever back and forth. :-) To do so, you provide your Git with a name like origin, which we call a remote. Git stores several items in a configuration for that remote, including the URL needed to call it up, and then uses the name of the remote to remember where their branches were, the last time it had this kind of conversation.

What this means in practice is that you don't have to make two branches of your own. We can get into that same situation when you have a branch, and someone else also has a branch, and you both make commits on your own branches, and then you use a "remote" to pick up their commits:

             o   <-- HEAD -> master
            /
...--o--o--*
            \
             o   <-- origin/master

This is, in fact, precisely how we get into the usual situation where Git suggests doing a merge.
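As a loose sketch of what that conversation achieves (a toy model in Python, with both repositories represented as plain dictionaries and a made-up fetch function), picking up their commits adds to your object store and moves only your remote-tracking names. Your own branch names stay where they were.

```python
from typing import Dict

Repo = Dict[str, dict]


def fetch(mine: Repo, theirs: Repo, remote: str = "origin") -> None:
    """Copy their commits, then record their branch tips under <remote>/<name>."""
    mine["commits"].update(theirs["commits"])
    for name, tip in theirs["branches"].items():
        mine["remote_tracking"][f"{remote}/{name}"] = tip


theirs = {"commits": {"star": (), "B": ("star",)}, "branches": {"master": "B"}}
mine = {
    "commits": {"star": (), "A": ("star",)},
    "branches": {"master": "A"},
    "remote_tracking": {"origin/master": "star"},
}

fetch(mine, theirs)
print(mine["branches"])          # my own master is untouched
print(mine["remote_tracking"])   # origin/master now remembers their tip
```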

Yay, we made it to merging! :-)

So, what exactly is merging?

The goal of a merge is easy to understand. Several people or groups, or even just one person with two or more tasks, started from a common code base, and made a series of changes. At some point, someone must combine the changes. The combination should take all the good parts of both changes (note the qualifier here: the good parts).

In this case, the common code base is whatever was in the commit we marked with *. Since then, you made some changes, and they made some changes.
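Finding that common starting commit from the graph alone can be sketched as follows. This is a simplified Python model with invented names (ancestors, merge_base): it returns the one common ancestor that sits in front of all the other common ancestors, and gives up with None when there is no single such commit. Real Git has additional handling for that harder case, which the sketch ignores.

```python
from typing import Dict, Optional, Set, Tuple

Graph = Dict[str, Tuple[str, ...]]  # commit id -> parent ids


def ancestors(commits: Graph, tip: str) -> Set[str]:
    """Return tip plus everything reachable from it through parent links."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(commits[cid])
    return seen


def merge_base(commits: Graph, ours: str, theirs: str) -> Optional[str]:
    """Return the common ancestor that has every other common ancestor behind it."""
    common = ancestors(commits, ours) & ancestors(commits, theirs)
    for candidate in sorted(common):
        if common - {candidate} <= ancestors(commits, candidate):
            return candidate
    return None


# o1 <- star, then A and B both point back to star
commits = {"o1": (), "star": ("o1",), "A": ("star",), "B": ("star",)}
print(merge_base(commits, "A", "B"))
```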

Let's say you changed README to describe a new feature, and they changed README to describe a new (but different) feature ... and both of you noticed a typo, and you both made the same fix.

When you merge these changes, you will want one copy of the typo fix (not two copies), and one copy of each of your new feature descriptions. If you are in luck, Git can do this for you.

The way Git tries to do this is, in effect at least, by running two git diffs. One compares commit * to your branch-tip. These are your changes. The second git diff compares commit * to their branch-tip to get their changes.

Git can then look at these two sets of changes and—with luck—figure out that the typo fix is the same change repeated, and take it exactly once. With even more luck, it can see that your new feature description and their new feature description are different but not conflicting, and take one copy of each of those. The result is a correct merge.

Git is not smart, though. It just follows a bunch of simple rules, and if those do not seem to produce a good result, it throws up its metaphorical hands and declares a merge conflict. In this case, the merge stops—does not complete automatically—and Git makes you fix up the mess.
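Those "simple rules" can be sketched per line. The Python below is a deliberately reduced model, not Git's merge algorithm: it assumes all three versions have the same number of lines (only in-place edits), whereas Git works on diff hunks. The function name merge_lines and the sample text are invented.

```python
from typing import List, Tuple


def merge_lines(base: List[str], ours: List[str], theirs: List[str]) -> Tuple[List[str], List[int]]:
    """Three-way merge of equal-length line lists; returns (result, conflicting indexes)."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # same change on both sides (or no change): take it once
            merged.append(o)
        elif o == b:      # only they changed this line
            merged.append(t)
        elif t == b:      # only we changed this line
            merged.append(o)
        else:             # both changed it differently: a human must decide
            merged.append(o)   # keep ours as a placeholder and flag the line
            conflicts.append(i)
    return merged, conflicts


base = ["teh intro", "feature list", "borders are lighter"]
ours = ["the intro", "feature list, plus mine", "borders are lighter"]
theirs = ["the intro", "feature list", "borders are pink"]
print(merge_lines(base, ours, theirs))   # clean: typo fix once, both other edits kept

ours_conflicting = ["the intro", "feature list", "borders are green"]
print(merge_lines(base, ours_conflicting, theirs))   # line 2 is flagged
```

The key point is that every decision compares each side against the base, never just the two sides against each other.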

I ... don't see how resolving conflicts can be anything other than changing your files to match that of the files you're merging from.

In many or even most cases, that's true ... but not in all cases.

Let's look at a README type file again (plain text), and let's say their change looks like this:

 some images have red borders and some
-images have lighter borders,
+images have pink borders,
 and the difference is

Let's say further that yours is:

 some images have red borders and some
-images have lighter borders,
+images have more pinkish or even green borders,
 and the difference is
+[some stuff here about special green borders]
 ...

Now you clearly do not want to drop your green border information; you might want to change "more pinkish" to just plain "pink", or maybe not. The point here is that you have to actually think about combining things.

With code, combining code might require some syntactic changes. (There are reasons to make these as separate commits before merging, though in most cases you can just do it directly in the merge.) Or, perhaps they fixed a bug, but you fixed it better, in which case you should drop their changes entirely (perhaps including changes they have made to other files, that Git thinks it has successfully merged). Or perhaps they fixed it better and you should drop your changes!

Again, you have to exercise some judgement. Git usually gets the merge right on its own, but not always, whether or not Git declares a conflict. Combining both sets of changes is usually right, but not always.

In any case, once the merge is done, you (or Git) will commit it. This makes a new merge commit, which Git adds on the end of your current branch in the usual way. But because it is a merge commit, it has two parents, instead of just the one. The first parent is the same as for any ordinary commit. The second parent is the commit you merged, so we can draw that now:

             o
            / \
...--o--o--*   o  <-- HEAD -> master
            \ /
             o   <-- origin/master

The new merge commit links back to your previous branch tip commit (on the top row), but also to their branch tip (on the bottom row).
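In the same toy-model spirit as the drawings, recording a merge amounts to one new commit with two parents plus one branch-name update. The Python below is only a sketch; record_merge and the IDs star, A, B and M are invented for it.

```python
from typing import Dict, Tuple

commits: Dict[str, Tuple[str, ...]] = {"star": (), "A": ("star",), "B": ("star",)}
branches = {"master": "A"}
remote_tracking = {"origin/master": "B"}


def record_merge(new_id: str, into: str, other_tip: str) -> None:
    """Create a merge commit on branch `into` with two parents, then advance the branch."""
    first_parent = branches[into]   # where we were: same as for an ordinary commit
    commits[new_id] = (first_parent, other_tip)
    branches[into] = new_id


record_merge("M", into="master", other_tip=remote_tracking["origin/master"])

print(commits["M"])       # first parent is A, second parent is B
print(branches)           # master now points to M
print(remote_tracking)    # origin/master still points to B
```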

(Note that you can merge any of your local branches with any other local branch, not necessarily a remote branch. If you spend a week working on one feature in one branch, and another week working on another, you might have two series of commits on two different local branches, that you can profitably merge.)

git push

Last, let's look at the git push case. To get there, we have to backtrack just a bit, because we glossed over how you got origin/master in the first place.

When you run git pull, it really runs two separate commands:

  • git fetch, and then ...
  • git merge (or if you tell it, or configure it, git rebase).

It's the git fetch step that obtains their new commits—whoever "they" are—and updates your origin/master remote-tracking branch so that it now points to their new tip-most commit.

This tip-most commit is the tip of their master branch. In your repository, your Git renames it, so that their master branch does not interfere with yours. The renamed remote-tracking branch is now origin/master because the remote is named origin.[1]

When you use git push, you send your commits over to them, and then ask them to set their master (or whatever branch name) to point to your tip-most commit. There is no merge step here.

This is the critical bit, which is hard to cover (because it depends on Git DAGs and knowing some graph theory, in particular all the parent pointer stuff). It really is critical though, so again...

Push does not merge

When you run git push, you have your Git call up their Git as usual, but instead of getting their new commits and setting your origin/whatever remote-tracking branches to point to their new branch tips, you now ask them to set their branches—not remote-tracking at all, just ordinary branches—to the branch tips of your choice.

It is up to you to make sure that these branch tips are sensible, but by default, their Git won't take new branch tips that would cause them to "lose" commits.

Let's go back to the graph drawing yet again. Here's what you have, after git fetch, before you do any merging.

             A   <-- HEAD -> master
            /
...--o--o--*
            \
             B   <-- origin/master

Note that your master points to commit A, while their master (your origin/master) points to commit B.

If you tell them to set their master to point to commit A, and they do it, their Git will find the master branch by starting at commit A and working backwards through the graph, to commit *, and then on back through the o commits on the left side, and never see commit B anymore. Commit B will appear to be gone: it's pretend-removed, and if there is no reflog, it gets garbage collected and then it's really removed.

If you do a git merge, though, your Git will make a new merge commit. Let's draw that with the letters added:

             A
            / \
...--o--o--*   M  <-- HEAD -> master
            \ /
             B   <-- origin/master

Now you can send them your commit M, and then ask them to set their master to point to commit M, and this won't "lose" commit B because M points to B (as its second-parent).

It's not the git push that caused this merge, though. You made the merge, locally, in your repository, using your master (pointing to commit A) and your origin/master (pointing to commit B). Now that you and your Git have made M, that is suitable for you to push to origin. They won't have to merge, because you did it for them, and they can tell, because their branch-tip is currently still B and M points to B.
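The default "don't lose commits" check can be modelled as a single ancestry test. The following Python is a simplified sketch under the assumptions of this answer's drawings (commit IDs A, B, M, and made-up function names is_ancestor and try_push); it is not the code any Git server runs.

```python
from typing import Dict, Tuple

Graph = Dict[str, Tuple[str, ...]]  # commit id -> parent ids


def is_ancestor(commits: Graph, older: str, newer: str) -> bool:
    """True if `older` can be reached from `newer` by following parent links."""
    seen, stack = set(), [newer]
    while stack:
        cid = stack.pop()
        if cid == older:
            return True
        if cid not in seen:
            seen.add(cid)
            stack.extend(commits[cid])
    return False


def try_push(their_branches: Dict[str, str], commits: Graph, name: str, new_tip: str) -> bool:
    """Set their branch to new_tip only if their current tip stays reachable."""
    old_tip = their_branches.get(name)
    if old_tip is not None and not is_ancestor(commits, old_tip, new_tip):
        return False                    # rejected: their old tip would be "lost"
    their_branches[name] = new_tip      # no merging happens here, just a name update
    return True


commits = {"star": (), "A": ("star",), "B": ("star",), "M": ("A", "B")}
their_branches = {"master": "B"}

print(try_push(their_branches, commits, "master", "A"))  # A does not lead back to B
print(try_push(their_branches, commits, "master", "M"))  # M has B as a parent
print(their_branches)
```

Notice that try_push never combines any file contents. It either moves a name or refuses to, which is the whole sense in which push does not merge.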


[1] In fact, the full name is refs/remotes/origin/master, which is in a completely separate name-space from local branches. This handles what would otherwise be a disaster on your part if you accidentally create a branch named origin/master. It's still a mess, but because your branches all secretly start with refs/heads/ and theirs all start with refs/remotes/, yours and theirs can never collide.

torek

When Git merges a file automatically, it takes the changes from the remote server and tries to combine them with your local changes. Sometimes (for example, if the same line was changed both by you and by another developer), Git cannot choose the right change for you. You have to choose which line is best.

For example

Your change:

Line 1
Line 2 - My change
Line 3

Change from another developer to the same file:

Line 1
Line 2 - Other change
Line 3

Git cannot decide which "Line 2" is best to use. That is your job.
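When this happens, Git leaves both versions in the file, wrapped in conflict markers. The small Python sketch below just reproduces that layout for the example above; the helper name conflict_block is made up, and the label other-branch stands in for whatever branch or commit is actually being merged.

```python
from typing import List


def conflict_block(ours: List[str], theirs: List[str],
                   ours_label: str = "HEAD",
                   theirs_label: str = "other-branch") -> List[str]:
    """Return two competing versions wrapped in Git-style conflict markers."""
    return ["<<<<<<< " + ours_label, *ours, "=======", *theirs, ">>>>>>> " + theirs_label]


file_lines = (
    ["Line 1"]
    + conflict_block(["Line 2 - My change"], ["Line 2 - Other change"])
    + ["Line 3"]
)
print("\n".join(file_lines))
```

Resolving the conflict means editing the file until only the text you want remains, with all three marker lines gone, and then staging and committing the result.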

You can run

git mergetool

to resolve the conflict in a visual tool. I like to use meld as the tool.

Flows

When you pull while you have uncommitted edits to a file that the incoming commits also change, Git refuses to overwrite that file and the pull stops. One way to let the pull go ahead (not the only one) is to throw away your uncommitted edits to that file:

git checkout -- <file>

BE CAREFUL: this command discards all uncommitted edits to that file, so save the changed file in a different location or copy it to your clipboard first.

Andrei Tumbar