4

Background: I'm working on a pre-commit tool. As a developer it can be quite frustrating to pull another developer's branch and have the pre-commit hook complain loudly about files that I haven't even touched.

What I'd like my implementation to do, in the merge-commit case, is run the hooks only on files that were either conflicting or manually edited by me locally while resolving the merge.

What I've tried so far:

  • git diff --staged - This is what the current solution does and isn't correct because it contains all of the files including the ones that merged cleanly.
  • git diff MERGE_HEAD - This is pretty close, but if the branch I am merging in branched from master after I did, this contains all of the changes from master that I haven't yet merged.
  • .git/MERGE_MSG contains the list of conflicting files. This seems like a good starting point but does not contain locally edited files.
  • After committing, git show --name-only gets me exactly what I want. But that's too late (I'm implementing pre-commit after all :D)
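
For the .git/MERGE_MSG idea, a minimal sketch of pulling the conflicting paths out of the message text. It assumes the conflict list is written as tab-indented lines, optionally commented out with a leading "#" (the layout differs between git versions, so treat this as an assumption). The function name `conflicts_from_merge_msg` is made up for illustration.

```python
def conflicts_from_merge_msg(merge_msg: str) -> list[str]:
    """Return the paths listed under "Conflicts:" in a MERGE_MSG body.

    Assumption: each conflicting path sits on its own line starting with a
    tab, or with "#" followed by a tab when git comments the list out.
    """
    return [
        line.lstrip("#").strip()
        for line in merge_msg.splitlines()
        if line.startswith(("\t", "#\t"))
    ]
```

As noted above, this only covers the conflicting files; local edits made during the merge still need another source.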
bukzor
anthony sottile

2 Answers

4

When you git add to resolve a conflict, you erase the record of the conflict. So to preserve this information without having to reconstruct it, preserve the index file:

cp .git/index .git/preserved-merge-index

and then

GIT_INDEX_FILE=.git/preserved-merge-index git ls-files --unmerged

will show you the conflicts, and

GIT_INDEX_FILE=.git/preserved-merge-index git diff-files --name-only

will show you everything in your work tree that's changed since it was recorded in the merge index.
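
If the tool consuming this is written in Python, the output of git ls-files --unmerged can be reduced to a list of conflicted paths. This is a rough sketch, assuming the default (non -z) output where each line is `<mode> <object> <stage>`, a tab, then the path, with one line per stage. The helper name `unmerged_paths` is hypothetical.

```python
def unmerged_paths(ls_files_output: str) -> list[str]:
    """Collapse `git ls-files --unmerged` output to unique paths, in order.

    Each conflicted file appears once per stage (1 = base, 2 = ours,
    3 = theirs), so the same path usually shows up several times.
    """
    seen: dict[str, None] = {}
    for line in ls_files_output.splitlines():
        _meta, tab, path = line.partition("\t")
        if tab:  # skip anything that is not "<meta>\t<path>"
            seen.setdefault(path, None)
    return list(seen)
```

Paths with unusual characters are quoted by git in this output mode; passing -z and splitting on NUL instead would avoid that.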

From the comments, you might also be adding files directly in the merge. To catch those, after you've resolved all your merge conflicts you can run

GIT_INDEX_FILE=.git/preserved-merge-index git diff-index --name-only `git write-tree`
jthill
  • This results in the same thing (in my little test case), and seems simpler. Mucking with .git/ files is a minus though. – bukzor Apr 07 '14 at 19:14
  • Enh. "[full access to internals.](https://www.kernel.org/pub/software/scm/git/docs/git.html#_description)" `MERGE_HEAD` is a .git/ file too. `git stash` isn't built to do what you're winkling it into doing. wrapping a layer of commands meant for other things around the mucking just gets in the way. – jthill Apr 07 '14 at 19:19
  • (to be more specific: `GIT_INDEX_FILE` is a published interface. git's _meant_ to be used this way, and its internals are so simple and so robust that it rewards it. the command set isn't an abstraction, it's a toolkit. the repository structure is what the tools are there to help you work with, but there's no law or reason not to just do what needs doing) – jthill Apr 07 '14 at 19:33
  • Actually... I went back and tried this and could not reproduce the behaviour I wanted. Here's my simple test and output: http://paste.pound-python.org/show/ZJNTQ5doduH6HiMWOVB5/ http://paste.pound-python.org/show/eV3ZwpEibL3mZsitqi8A/ I expect `conflict_file` and `commit_introduced_file` in the output. – anthony sottile Apr 15 '14 at 04:28
  • You need to preserve the index file immediately after the merge, so work you've done since shows up as differences. [Fixing that works](http://paste.pound-python.org/show/6GNWNug9vr67CUIYI9KY/) for the scenario in your question (conflicts and gratuitous edits), but it doesn't catch files gratuitously _added_ in the merge commit. I've added something to catch that too. – jthill Apr 15 '14 at 14:56
  • How can I preserve the index immediately after the merge if I am running as a pre-commit? I don't think I have access to that. – anthony sottile Apr 15 '14 at 15:34
  • That's definitely too late, just save it before doing any conflict resolution. – jthill Apr 15 '14 at 15:56
  • +1 Thanks for showing me `git write-tree`, I ended up going with bukzor's approach and `git write-tree`: `git diff -m \`git write-tree\` HEAD MERGE_HEAD --name-only` because I don't have the ability to save an index at the point I would need to. – anthony sottile Apr 15 '14 at 16:26
  • How do you tell from the output of that whether changes to a file that had changes in both parents were conflicting? – jthill Apr 15 '14 at 16:39
  • See comments here: https://github.com/pre-commit/pre-commit/blob/3ebf976afbadc74882e51c27475594d9f24444ce/pre_commit/git.py#L37,L51 – anthony sottile Apr 15 '14 at 16:50
2

I believe the solution is git diff -m. I found the doc on this very confusing, so here's my summary. Given the command git diff -m child parent1 parent2 ..., you'll see a multi-parent diff that shows how to get from each parent to the child. parent1 is represented in the first column of [ +-] markers, parent2 in the second, and so on. The major roadblock here is that the child in your question has no referenceable name yet. git write-tree comes to the rescue here: it writes the currently staged files out as a tree object and prints that object's name.

Note that write-tree will fail if there are any unmerged files, which is probably what you want, but you'll need to make sure your system does something intelligible in that case.
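
One way to make that failure case intelligible from Python is sketched below. It assumes the hook shells out to git; the function names and the injectable `run` parameter are illustrative choices, not part of any existing tool. The -z flag is added here so file names can be split on NUL.

```python
import subprocess


def staged_tree(run=subprocess.run):
    """Return the tree id for the current index, or None if write-tree fails
    (for example because unmerged files are still present)."""
    result = run(["git", "write-tree"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def files_changed_by_merge(run=subprocess.run):
    """List the names reported by `git diff -m <tree> HEAD MERGE_HEAD`.

    Returns None when the index cannot be written as a tree or the diff
    fails, so the caller can decide how to report it.
    """
    tree = staged_tree(run)
    if tree is None:
        return None
    result = run(
        ["git", "diff", "-m", "--name-only", "-z", tree, "HEAD", "MERGE_HEAD"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    # -z separates names with NUL; a name may repeat once per parent.
    return sorted({name for name in result.stdout.split("\0") if name})
```

Passing `run` in makes it possible to exercise both the success and failure paths without a real repository.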

$ CURRENTLY_ADDED=`git write-tree`
$ git diff -m $CURRENTLY_ADDED HEAD MERGE_HEAD
diff --cc README
index 2ef4a65,8e2e466..be3d46e
--- a/README
+++ b/README
@@@ -1,10 -1,5 +1,10 @@@
 -deleted only in <theirs>
 +added only in <theirs>
- deleted only in <ours>
+ added only in <ours>
--deleted during merge
++added during merge
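
To read those marker columns programmatically, here is a small sketch that pulls out the lines carrying "+" in every parent column, i.e. lines present in the result but in none of the parents. It assumes combined-diff text shaped like the output above; the name `lines_added_in_merge` is invented for this example.

```python
def lines_added_in_merge(combined_diff: str, n_parents: int = 2) -> list[str]:
    """Return hunk lines marked "+" against every parent.

    In a combined diff each hunk line starts with one marker column per
    parent, and hunk headers use n_parents + 1 "@" characters.
    """
    hunk_marker = "@" * (n_parents + 1)
    all_plus = "+" * n_parents
    added = []
    in_hunk = False
    for line in combined_diff.splitlines():
        if line.startswith("diff "):
            in_hunk = False  # new file header; "+++ b/..." must be skipped
        elif line.startswith(hunk_marker):
            in_hunk = True
        elif in_hunk and line[:n_parents] == all_plus:
            added.append(line[n_parents:])
    return added
```

Tracking whether we are inside a hunk matters because the "+++ b/README" header line would otherwise look like a line added against both parents.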
bukzor