8

I have (a) a working directory without its .git directory and (b) a repository. a corresponds to some revision in the middle of b's history.

How can I find out which revision in b matches a?

I thought of a shell script that diffs the working directory against every revision and picks the one with the fewest (hopefully zero) differences.

That seems a bit crude (and I'm not sure how to do it). Is there an easier way?

fabb

4 Answers

4

You could write a script that runs diff gitdir workdir | wc -c for each commit, collects the results, and reports the commit with the smallest difference (as measured by wc -c) as the closest match to the bare working directory.

Here is what it might look like in Python:

find_closest_sha1.py:

#!/usr/bin/env python3
"""Find the commit in gitdir whose checked-out tree is closest to workdir."""
import os
import subprocess
import sys

gitdir, workdir = map(os.path.realpath, sys.argv[1:3])
os.chdir(gitdir)


def git(*args):
    """Run a git command in gitdir and return its stdout as text."""
    return subprocess.run(('git',) + args, stdout=subprocess.PIPE,
                          stderr=subprocess.DEVNULL, text=True).stdout


# Remember where HEAD was so it can be restored afterwards.
original = git('rev-parse', '--abbrev-ref', 'HEAD').strip()
if original == 'HEAD':  # detached HEAD: fall back to the commit id
    original = git('rev-parse', 'HEAD').strip()

data = {}
for sha1 in git('rev-list', '--all').split():
    git('checkout', '--quiet', sha1)
    # -r recurses into subdirectories; -x .git skips the repository metadata.
    proc = subprocess.run(['diff', '-r', '-x', '.git', gitdir, workdir],
                          stdout=subprocess.PIPE)
    data[sha1] = len(proc.stdout)  # size of the diff output, like `wc -c`

answer = min(data, key=data.get)
print('closest match: {s}'.format(s=answer))
git('checkout', '--quiet', original)

Example:

% rsync -a gitdir/ workdir/
% cd workdir
% git checkout HEAD~10
HEAD is now at b9fcebf... fix foo

% cd ..
% /bin/rm -rf workdir/.git
% find_closest_sha1.py gitdir workdir
closest match: b9fcebfb170785c19390ebb4a9076d11350ade79
unutbu
1

You could pare down the number of revisions you have to check with the pickaxe. Diff your working directory against the latest revision and pick a differing line that looks as rare as possible. Say your latest revision has a line containing foobar but your work directory does not: run git log -Sfoobar, which lists all commits that add or remove foobar. You can now move your repository back to the first (latest) revision on that list, since all of the revisions after that one are going to differ from your work directory. Repeat with another difference until you find the correct revision.
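As a rough sketch of one pickaxe step, the Python below wraps git log -S and returns the newest commit on the resulting list. The function names and the repo argument are made up for illustration, and it assumes git is on the PATH and that the search string is a plain literal.

```python
import subprocess


def pickaxe_command(needle):
    """Build the git command listing commits that add or remove `needle`."""
    # --format=%H prints only the full commit hash, one per line.
    return ['git', 'log', '--format=%H', '-S' + needle]


def newest_commit_touching(needle, repo='.'):
    """Return the first commit git log -S reports for `needle`, or None."""
    out = subprocess.run(pickaxe_command(needle), cwd=repo,
                         stdout=subprocess.PIPE, text=True, check=True).stdout
    hashes = out.split()
    return hashes[0] if hashes else None
```

Calling newest_commit_touching('foobar', '/path/to/b') (a placeholder path) would give the revision to move back to before repeating the step with another differing line.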

Jouni K. Seppänen
1

Since git uses a content-addressable object store, it should be possible to find an arbitrary tree in it, but I don't know the details. My guess is that you could copy the files from the detached work directory into the repository's work directory, commit everything, find the hash of the tree object created by the commit, and search the existing commits for one that references the same tree.

For this to work, the tree will obviously need to match perfectly, so you must not get any non-tracked files into the commit (such as object files, editor backups, etc).

Edit: I just tried this on one repository (with git cat-file commit HEAD to show the tree object at HEAD, and searching the output of git log --pretty=raw for that tree hash), and it didn't work (I didn't find the hash in the history). I did get a bunch of warnings about CRLF conversion when I did the commit, so that might have been the problem, i.e. you probably get different hashes for the same tree depending on how your git is configured to mangle text files. I'm marking this answer community wiki in case someone knows how to do this reliably.
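To illustrate why a line-ending conversion could plausibly break the match, here is a small sketch of how git derives a blob id from file content. It assumes a repository using the default SHA-1 object format; the function name git_blob_id is made up for this example. Tree ids are computed from the ids of the blobs they contain, so a single changed byte in any file would be expected to produce a different tree id.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 id git assigns to a blob with exactly this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b'blob ' + str(len(content)).encode('ascii') + b'\0'
    return hashlib.sha1(header + content).hexdigest()


lf = git_blob_id(b'hello\n')
crlf = git_blob_id(b'hello\r\n')
print(lf)
print(crlf)
print(lf == crlf)  # False: the CRLF variant is a different blob
```

If the files were committed with one line-ending setting and re-added with another, every affected text file would hash differently, which is consistent with the tree hash not turning up in the history.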

Jouni K. Seppänen
0

Assuming that the in-tree and b/.git ignore settings are as they were when the commit was created, and that there aren't any non-ignored untracked files in the working tree, you should be able to run something like this.

The strategy is to recreate the tree id of the working tree and then search for any commit that references this tree.

# work from detached working tree
cd a

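# Use the existing repository (as an absolute path, since we are now inside a)
# and a temporary index file so that b's own index is left untouched
GIT_DIR=/path/to/b/.git
GIT_INDEX_FILE=/tmp/tmp-index
export GIT_DIR GIT_INDEX_FILE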

# find out the id of the current working tree
git add . &&
tree_id=$(git write-tree) &&
rm /tmp/tmp-index

# find a commit that matches the tree
for commit in $(git rev-list --all)
do
    if test "$tree_id" = "$(git rev-parse ${commit}^{tree})"; then
        git show "$commit"
        break
    fi
done

unset GIT_DIR
unset GIT_INDEX_FILE
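
The search loop could also be done in a single pass over the history. The sketch below is one possible variant in Python, not part of the script above: it asks git log for each commit hash together with its tree hash (the %H and %T format placeholders) and filters on the tree id. The function names and the git_dir parameter are hypothetical, and it assumes git is on the PATH.

```python
import subprocess


def commits_with_tree(log_output, tree_id):
    """Return commit ids whose tree id equals `tree_id`.

    `log_output` is text with one "<commit> <tree>" pair per line.
    """
    matches = []
    for line in log_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        commit, tree = line.split()
        if tree == tree_id:
            matches.append(commit)
    return matches


def find_commits(tree_id, git_dir):
    """List commits reachable from any ref that reference `tree_id`."""
    out = subprocess.run(
        ['git', '--git-dir', git_dir, 'log', '--all', '--format=%H %T'],
        stdout=subprocess.PIPE, text=True, check=True).stdout
    return commits_with_tree(out, tree_id)
```

Unlike the loop with break, this returns every match, which can be useful because several commits may share the same tree (for example, after a revert).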
CB Bailey