
I'm using JGit for one of my projects, which involves intensive use of Git.

My aim is to use a RevWalk to iterate over the commits in a repository in chronological order, starting at a specific commit. I've managed to achieve each of these separately:

  • Chronological order by applying a RevSort.REVERSE
  • Starting point by calling RevWalk.markStart(RevCommit c)

My problem is that when I try to combine the two, the RevSort seems to override the markStart, and the RevWalk always ends up starting at the first commit in the history instead of the commit that I've specified.

This snippet shows what I've got:

import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.internal.storage.file.FileRepository;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevSort;

import java.io.IOException;
import org.eclipse.jgit.errors.AmbiguousObjectException;
import org.eclipse.jgit.errors.MissingObjectException;

public class Main {

    public static void main(String[] args) throws IOException, AmbiguousObjectException, MissingObjectException {
        final String repositoryPath = args[0];
        final String commitID = args[1];
        final Repository repository = new FileRepository(repositoryPath + "/.git");
        final RevWalk walk = new RevWalk(repository);
        walk.sort(RevSort.REVERSE);
        walk.markStart(walk.parseCommit(repository.resolve(commitID)));
        for (final RevCommit revCommit : walk) {
            System.err.println(revCommit.getId());
        }
    }

}

This should print the IDs of the commits in chronological order starting at the specified commit, but it ignores the second setting and starts from the initial commit.

UPDATE:

I've investigated the problem further, and it turns out that when the two options are applied together (in any order), the markStart becomes a markStop. I think this is caused by markStart always being executed first and limiting the range of commits (with a filter), and then having those reversed by the RevSort. Basically, the RevWalk is iterating over the complement of the set of commits that I'm interested in.

Should I assume that what I'm trying to do is not obtainable in this way? I couldn't think of another way to get it without traversing the whole repository up to my starting point, but that sounds highly inefficient.

UPDATE 2: To give a proper example, here is what I was expecting to achieve. Assume that we have a repository containing 4 commits: A, B, C and D. I'm interested only in the commits from B to the current revision, excluding A, in chronological order. I was hoping to be able to use markStart and sort to achieve that in the following way:

@Test
public void testReverse2() throws Exception {
    final RevCommit commitA = this.git.commit().setMessage("Commit A").call();
    final RevCommit commitB = this.git.commit().setMessage("Commit B").call();
    final RevCommit commitC = this.git.commit().setMessage("Commit C").call();
    final RevCommit commitD = this.git.commit().setMessage("Commit D").call();

    final RevWalk revWalk = new RevWalk(this.git.getRepository());
    revWalk.markStart(revWalk.parseCommit(commitB));
    revWalk.sort(RevSort.REVERSE);

    assertEquals(commitB, revWalk.next());
    assertEquals(commitC, revWalk.next());
    assertEquals(commitD, revWalk.next());
    assertNull(revWalk.next());
    revWalk.close();
}

Now, from what I've seen, this doesn't work because markStart is always executed before the sort, so the actual behaviour satisfies the following test:

assertEquals(commitA, revWalk.next());
assertEquals(commitB, revWalk.next());
assertNull(revWalk.next());

That is the opposite of what I'm trying to obtain. Is this intended behaviour and, if so, in what other way could I approach the problem?

vivianig

2 Answers


In Git, commits have only links to their parent(s). commitB does not know about its successors commitC and commitD.

Hence a history can only be traversed backwards, from a given commit to its parents, grandparents, etc. There is no information that allows traversal in the opposite direction.

In your example, the RevWalk will walk from commitB to commitA. The REVERSE sort only affects the order in which the iterator returns the commits it has found; it cannot make the walk go forward.

If you actually want to find the commits between commitB and HEAD, you will need to start at HEAD. Or, more generally, you would need to start from all known branch tips to find the possibly multiple paths that lead to commitB.
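
To make the point concrete, here is a minimal sketch in plain Java that does not use JGit at all. The `Commit` class and both method names are hypothetical stand-ins, and the model assumes a single linear branch where each commit has at most one parent. It shows that reversing a backward walk from B can only ever yield A and B, while walking back from the head until B is reached and then reversing yields B, C, D.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ForwardWalkSketch {

    /** Hypothetical stand-in for a commit: like a Git commit, it only knows its parent. */
    static final class Commit {
        final String name;
        final Commit parent; // null for the root commit

        Commit(String name, Commit parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Walks backwards from the given commit to the root and returns the result oldest-first. */
    static List<String> reversedWalkFrom(Commit start) {
        Deque<String> stack = new ArrayDeque<>();
        for (Commit c = start; c != null; c = c.parent) {
            stack.push(c.name); // push puts older commits in front of newer ones
        }
        return new ArrayList<>(stack);
    }

    /**
     * Walks backwards from head until start is reached and returns the visited
     * commits oldest-first. Returns an empty list if start is not an ancestor of head.
     */
    static List<String> chronologicalFrom(Commit start, Commit head) {
        Deque<String> stack = new ArrayDeque<>();
        for (Commit c = head; c != null; c = c.parent) {
            stack.push(c.name);
            if (c == start) {
                return new ArrayList<>(stack);
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        Commit a = new Commit("A", null);
        Commit b = new Commit("B", a);
        Commit c = new Commit("C", b);
        Commit d = new Commit("D", c);

        System.out.println(reversedWalkFrom(b));     // prints [A, B]
        System.out.println(chronologicalFrom(b, d)); // prints [B, C, D]
    }
}
```

In JGit terms, the second method would presumably correspond to calling markStart on the HEAD commit, markUninteresting on each parent of commitB, and then sort(RevSort.REVERSE). That mapping is an assumption and has not been run against the test in the question.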

Rüdiger Herrmann
  • Assuming that I'm interested in only a single branch, what would be the best way to go from `commitB` to the `HEAD` (maintaining a chronological order)? My current approach ended up being the use of `sort` and just doing a whole walk from the start, manually identifying where I arrive at the point that I wanted to set as `start`. But this feels quite inefficient; is there any other approach? – vivianig Aug 06 '15 at 21:21
  • Reconsider how commits reference each other in Git. There is no other approach: start at head (i.e. the point that you know `commitB` can be reached from) and walk until you find the desired commit. I don't see why this is inefficient. – Rüdiger Herrmann Aug 07 '15 at 09:09

The JGit API gives no contraindication of combining sort and markStart. The JGit source code shows no surface issues either. In my opinion, fixing this directly requires source-level debugging. You'll need the JGit source and need to run your example in a debugger.

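Alternatively, you can stream the RevWalk through its Spliterator without the sort and order the output by RevCommit::getCommitTime, as follows:

// Needs java.util.Comparator, java.util.List, java.util.stream.Collectors and java.util.stream.StreamSupport
List<RevCommit> commits = StreamSupport.stream(walk.spliterator(), false)
    .sorted(Comparator.comparingInt(RevCommit::getCommitTime))
    .collect(Collectors.toList());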
Alain O'Dea
  • Yes, I've done a deeper investigation on the source and updated the original post. – vivianig Aug 06 '15 at 16:56
  • @vivianig thank you for the update. I think you may have to accept the costly approach. Will my traversal visit more of the Git repository than the sort? It may wind up being equivalent. How much time does each method take (without the markStart)? My solution above should use the walk after calling markStart to reflect your intent directly. – Alain O'Dea Aug 06 '15 at 17:11
  • It would probably end up being equivalent since it still has to run through the whole repository to get to the given point. My problem is the way that `markStart` works, and I don't think that changing the way of iterating over the filtered set of commits would help. I'll look more into it later. – vivianig Aug 06 '15 at 21:17