If two users both 'git commit' on their individual laptops at the same second, and later git push to the same git repository, how does git determine which commit came first when running 'git log'?

This post says that git's timestamp resolution is 1 second.

Vahid Pazirandeh
  • Whoever pushes second will not be able to push and will receive a message telling them to pull first. I do not think the commit timestamp resolution is involved in that – evolutionxbox Sep 08 '21 at 00:21
  • Time is not sensitive to commit sequence in git log. A commit with small timestamp can lay before or after a commit with big timestamp. – LF00 Sep 08 '21 at 00:35

1 Answer

Commit ordering is determined by the commit graph, not by timestamps in the logs. However, the output from git log can be shown in timestamp order, rather than in graph order. So the answer to your question is "yes", or "no", or mu, rather than something actually useful. Here are the tricky bits.

Remember first that every commit has a unique hash ID. This hash ID, which is typically represented as a large hexadecimal number, looks random, is generally not usable by humans, and is in fact a cryptographic checksum of the commit's contents, with all that this implies. It is unique partly because it just will be: theory tells us that there will eventually be a hash collision, but also that it won't occur for billions of years.[1] Even if one does occur, Git's answer to that is "you can't add that commit".[2] So the hash ID is the commit, in a sense.

Meanwhile, every commit records two things:

  • Each commit has a full snapshot of all files.
  • Each commit has some metadata, such as who made the commit and when: the time stamps that are visible in git log output come out of this metadata, for instance.

The metadata in any one given commit include the hash ID of the previous commit (its parent). So this means that the commits are actually ordered by the fact that some later commit records their hash ID. This "later commit records hash ID of earlier commit" provides the commit graph. This commit graph takes the form of a Directed Acyclic Graph: later commits point to earlier ones, and cycles are precluded because an earlier commit cannot contain the hash ID of a later commit.[3]
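To make the "later commit records the hash ID of the earlier commit" point concrete, here is a minimal sketch in Python of how a SHA-1 commit ID is derived from the commit's own bytes, parent line included. The identity `U <u@example.com>`, the timestamps, and the messages are made up for illustration, and the helper names are hypothetical. Repositories that use SHA-256 would hash differently.

```python
import hashlib

def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 object ID: hash of '<kind> <size>' + NUL byte + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree: str, parents: list[str], stamp: str, message: str) -> bytes:
    """Build the text of a commit object (illustrative identity, not real data)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # the link to the earlier commit(s)
    lines.append(f"author U <u@example.com> {stamp}")
    lines.append(f"committer U <u@example.com> {stamp}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()

# The empty tree has a well-known ID, which makes a handy sanity check.
EMPTY_TREE = git_object_id("tree", b"")

root = git_object_id(
    "commit", commit_body(EMPTY_TREE, [], "1631060000 +0000", "C"))
other_root = git_object_id(
    "commit", commit_body(EMPTY_TREE, [], "1631060000 +0000", "C (edited)"))

# Same tree, same timestamp, same message: only the parent line differs.
child_a = git_object_id(
    "commit", commit_body(EMPTY_TREE, [root], "1631060100 +0000", "one"))
child_b = git_object_id(
    "commit", commit_body(EMPTY_TREE, [other_root], "1631060100 +0000", "one"))

print(EMPTY_TREE)
print(child_a != child_b)
```

Because the parent's hash ID is part of the bytes being hashed, changing anything in the earlier commit changes the ID of every commit that follows it.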

What this all boils down to is that it is the DAG itself, not the timestamps, that determines the commit order. In the scenario you described, where users U1 and U2 make two commits "at the same time" from the same starting commit, we end up with a divergence in the graph:

             1
            /
...--o--o--C
            \
             2

Commit 1's parent is commit C, and commit 2's parent is commit C. Commits 1 and 2 are unordered with respect to each other: they are siblings in the graph, so they have no parent/child relationship with each other. Both are children of C (both have C as their parent), so in the partial order established by the graph, C < 1 and C < 2, but 1 and 2 are simply unordered.
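The partial order can be sketched as a reachability test over parent links. This is only a toy model in Python: the `PARENTS` table and `is_ancestor` helper are hypothetical names that mirror the drawing, not anything Git exposes. (Git itself offers `git merge-base --is-ancestor` for the real thing; unlike this strict version, it treats a commit as an ancestor of itself.)

```python
# child -> list of parents, using the labels from the drawing above
PARENTS = {
    "C": [],
    "1": ["C"],
    "2": ["C"],
}

def is_ancestor(older: str, newer: str, parents=PARENTS) -> bool:
    """True if `older` can be reached from `newer` by following parent links."""
    stack = list(parents[newer])
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == older:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False

print(is_ancestor("C", "1"))  # C comes before 1
print(is_ancestor("C", "2"))  # C comes before 2
print(is_ancestor("1", "2"), is_ancestor("2", "1"))  # neither precedes the other
```

Two commits are ordered only when one is reachable from the other. For 1 and 2 the test fails in both directions, which is exactly what "unordered" means here.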

Now, it's not possible to locate commits 1 and 2 unless:

  • you know their raw hash IDs, or
  • you have names by which to locate them or any of their successors.

That is, if we have two branch names, we can find both commits:

             1   <-- user1
            /
...--o--o--C
            \
             2   <-- user2

Running git log user1 produces a log that shows commit 1, then commit C, then the first of the commits just to the left of C (labeled o here), and so on: commit 2 is not shown at all. Running git log user2 produces a log that shows commit 2, then commit C, then commits to the left of C in the drawing.

Running git log user1 user2, however, inserts both commits 1 and 2 into the to-be-shown queue. It is at this time that git log will pick the "higher priority" commit to show.

The priority of a commit, in this git log priority queue, is controlled by:

  • the committer date-stamp by default, or
  • the author date-stamp if using --author-date-order, or
  • by other criteria if specified at git log time.

If we posit that the committer and author date-stamps are identical, then the two commits have identical priority in the queue, and one of them will come out first, but there is no guarantee which one. They just come out however they come out.
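A simplified model of that queue, written in Python, may help. It is a sketch under stated assumptions, not Git's actual implementation: the timestamps are invented, `log_order` is a hypothetical helper, and the insertion counter used to break ties is purely an artifact of this sketch, not a promise Git makes.

```python
import heapq
from itertools import count

# name -> (committer timestamp, parents); the timestamps are made up
COMMITS = {
    "o": (100, []),
    "C": (200, ["o"]),
    "1": (300, ["C"]),
    "2": (300, ["C"]),  # same second as commit 1
}

def log_order(tips, commits=COMMITS):
    """Walk from the given starting commits, newest committer date first."""
    tick = count()  # tie-breaker for equal timestamps (sketch-only choice)
    heap, seen, shown = [], set(), []

    def push(name):
        if name not in seen:
            seen.add(name)
            # heapq is a min-heap, so negate the timestamp to pop newest first
            heapq.heappush(heap, (-commits[name][0], next(tick), name))

    for tip in tips:
        push(tip)
    while heap:
        _, _, name = heapq.heappop(heap)
        shown.append(name)
        for parent in commits[name][1]:
            push(parent)
    return shown

print(log_order(["1"]))       # like `git log user1`: commit 2 never appears
print(log_order(["1", "2"]))  # like `git log user1 user2`
print(log_order(["2", "1"]))  # same commits; the tie is broken differently
```

In this model the two tied commits come out in whatever order they were inserted, which illustrates the point: with identical date-stamps, nothing in the commits themselves decides which one is shown first.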

We cannot find both commits through one name, so before both users can git push these commits, at least one of them will have to do something tricky, such as:

  • create a new branch name, or
  • rebase a commit (copy a commit to a new and improved commit), or
  • merge some commits (add a new commit with two or more parents).

We already know the effect of creating a new branch name, as we saw that above. If one user rebases his or her commit atop the other's, the result is a new and different commit in a new and different graph:

...--o--o--C--1--2'   <-- branch-name

for instance. Here, git log will show commit 2' first, then commit 1, regardless of any date stamps.

If one user merges, we can get this:

             1
            / \
...--o--o--C   M   <-- branch-name
            \ /
             2

Now, if someone runs git log on this branch, they will see commit M first, then commits 1 and 2 in priority-queue order—because viewing commit M pushes both commits into the priority queue—and then commit C as before. The order in the priority queue depends on arguments to git log, just as in the case where we gave it two starting commit hash IDs via two branch names. However, adding --first-parent to git log will cause git log to push only the first parent of commit M into the queue, so that the queue depth is just 1 after visiting commit M. We will therefore see only commit M, then only the first parent (whichever parent that is), then only commit C, and so on: visiting a commit pushes its parents into the queue and --first-parent limits this to just its first parent.
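The effect of --first-parent on the merge drawing can be sketched with the same kind of toy queue. Again, this is a hedged Python model with invented timestamps and a hypothetical `walk` helper. It assumes commit 1 is M's first parent, and the insertion-order tie-break is a choice of the sketch, not Git's documented behavior.

```python
import heapq
from itertools import count

# name -> (committer timestamp, parents); M lists commit 1 as its first parent
COMMITS = {
    "C": (200, []),
    "1": (300, ["C"]),
    "2": (300, ["C"]),
    "M": (400, ["1", "2"]),
}

def walk(tip, first_parent=False, commits=COMMITS):
    """Visit commits from `tip`, newest first; optionally follow only first parents."""
    tick = count()  # sketch-only tie-breaker for equal timestamps
    heap, seen, shown = [], set(), []

    def push(name):
        if name not in seen:
            seen.add(name)
            heapq.heappush(heap, (-commits[name][0], next(tick), name))

    push(tip)
    while heap:
        _, _, name = heapq.heappop(heap)
        shown.append(name)
        parents = commits[name][1]
        if first_parent:
            parents = parents[:1]  # ignore every parent after the first
        for parent in parents:
            push(parent)
    return shown

print(walk("M"))                     # M, then 1 and 2 in queue order, then C
print(walk("M", first_parent=True))  # M, its first parent, then C
```

With `first_parent=True`, commit 2 is never pushed into the queue, so the tie between commits 1 and 2 never arises.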


[1] The actual elapsed time depends on the rate at which we generate hashes, and any progress someone makes on breaking the cryptography, but in practice it's "long enough".

[2] This neatly short-circuits the potential problem (the old commit hash ID remains unique) at the cost of "you can't add the new commit".

[3] To enforce this, the cryptographic checksum of the later commit includes the hexadecimal representation of the earlier commit's hash ID. This means that you have to know what a future commit's hash ID will be, in order to include that hash ID into the current commit; but knowing this and including it changes the hash ID of this commit, which in turn changes the hash ID of that future commit. Unless you can find a fixed point or small cycle, you just can't do it.

torek