
I have a directory containing a small git repository.

git status, and gitk --all, show no uncommitted changes.

If I tar/compress this directory with:

tar czf git-repo.tar.gz git-repo/

Then transfer this tar file to a test directory and untar with:

tar xzf git-repo.tar.gz

When I run:

cd test/git-repo/
gitk --all

gitk shows a line at the top of all the commits (with the usual red dot):

"Local uncommitted changes, not checked in to index"

git status still shows:

On branch master
nothing to commit, working tree clean

Testing for differences with:

diff -r git-repo/ test/git-repo/

Shows no differences.

If I run git clean, this line goes away.

I've tried git clean -i in hopes of seeing an offending filename; however, it completes without asking for interactive confirmation of a clean. Thereafter, gitk shows no uncommitted changes.

I am running: git version 2.11.0

It's just a little complicated to explain to the client "Why are there uncommitted changes?" when there actually aren't.

Thank you for any insight...

johnea
  • Don't copy a git repository (and working directory). https://stackoverflow.com/questions/49894500/git-wants-to-add-already-tracked-files/49896284#49896284 – Edward Thomson Apr 23 '18 at 21:55
  • All of that seems to have to do with moving to windoze or mac. Not a factor here. All machines are linux. This issue actually arose on the same machine, just making sure the tar contents looked good before carrying it to the client. – johnea Apr 23 '18 at 22:13
  • On second thought, I have to add, just saying "Don't do that" really isn't helpful. If you don't have any insight into why gitk might be doing this, just don't answer. – johnea Apr 23 '18 at 22:16
  • I would be happy to expand on my answer as to why you shouldn't do that. – Edward Thomson Apr 30 '18 at 09:50

2 Answers

gitk is looking at the cache information in the index to determine whether your working directory is dirty or not. The index stores information about the state of the current working directory, so that git does not need to analyze the contents of every file each time.

When you run git status, it will compare the contents of HEAD to the contents of the index, to show staged changes. This is simple and quick; if the file's ID is different, then its contents must be different. However, there is a more costly computation to determine if a file has unstaged changes. The file must have its SHA1 computed, and then compared to the value in the index.

To avoid this costly computation, git caches the struct stat information about the working directory contents in the index:

README.md
  ctime: 1516120578:638662531
  mtime: 1516120578:638662531
  dev: 16777220 ino: 1752439
  uid: 501      gid: 20
  size: 13224   flags: 0
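To look at this cached information in your own repository, git ls-files --debug prints it per index entry. A minimal sketch, assuming a throwaway repository in a temporary directory and a hypothetical file named foo:

```shell
#!/bin/sh
set -eu

# Throwaway repository; the path and the file name "foo" are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "hello" > foo
git add foo

# Print the index entry for foo, including the cached stat fields
# (ctime, mtime, dev, ino, uid, gid, size, flags).
stat_cache=$(git ls-files --debug -- foo)
printf '%s\n' "$stat_cache"
```

The actual values depend on your filesystem and clock, so they will differ from the README.md entry shown above.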

Now, when you run git status, it can just stat the contents of the working directory. If a file has the same size, inode, ctime, mtime, etc. as the cached entry, then git assumes that the file has not changed. This allows git status to stay performant when you have unchanged files. But if a file has a different value, then git will hash the file. If the file has the same hash (i.e., you've simply run touch on the file without changing the contents), then the index will be updated with the new cache information. If you've actually changed the file, then git status will report the unstaged change.

gitk however does not bother to hash the file to determine whether it has truly changed. You can see this yourself with a trivial example. Here I have a repository with one file, foo, with no changes.


If, on the command-line, I touch the file, updating its timestamp:

% touch foo

Now, gitk reports my repository as having uncommitted changes.

However, if I run git status again on the command line, it will update the cache information in the index, and now gitk will understand that there really aren't any unstaged changes.
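The same sequence can be checked from the command line without opening gitk. This is only a sketch: it assumes that gitk's "uncommitted changes" line is driven by the plumbing command git diff-files, which compares against the cached stat data without rehashing. The repository, the file name foo, and the commit identity are placeholders.

```shell
#!/bin/sh
set -eu

# Throwaway repository with one committed file (placeholder names).
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "hello" > foo
git add foo
git -c user.name=Example -c user.email=example@example.com commit -q -m "add foo"

# Change only the timestamp; the content is untouched.
touch -t 200101010000 foo

# Compare the working directory against the cached stat data only.
stale=$(git diff-files --name-only)
echo "before refresh: '$stale'"

# git status rehashes the file and rewrites the index entry.
git status >/dev/null

fresh=$(git diff-files --name-only)
echo "after refresh: '$fresh'"
```

If git status is more than you need, git update-index -q --refresh should refresh the cached stat data in the same way.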

When you untarred your repository, including the working directory, you put on disk a working directory that doesn't match the cache information in the index. git would actually rehash the contents to determine that your working directory is not, in fact, dirty, but gitk does not.
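The tar round trip from the question can be reproduced the same way. A sketch under the same assumption (git diff-files standing in for what gitk checks), using placeholder file contents and a placeholder commit identity. The extracted files are new files on disk, so at least their inode numbers differ from what the index recorded:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Small repository with one committed file (placeholder content).
git init -q git-repo
echo "hello" > git-repo/foo
git -C git-repo add foo
git -C git-repo -c user.name=Example -c user.email=example@example.com \
    commit -q -m "add foo"

# Archive and unpack, as in the question.
tar czf git-repo.tar.gz git-repo/
mkdir test
tar xzf git-repo.tar.gz -C test

# The copied index still describes the original files' stat data.
stale=$(git -C test/git-repo diff-files --name-only)
echo "after untar: '$stale'"

# Refreshing the index once in the copy brings the cache up to date.
git -C test/git-repo update-index -q --refresh
fresh=$(git -C test/git-repo diff-files --name-only)
echo "after refresh: '$fresh'"
```

Running git status (or the refresh above) once in the unpacked copy before opening gitk should therefore make the extra line disappear.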

It is generally not a good idea to copy a git repository together with its working directory; you should check out a new working directory instead.
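One possible way to hand over a repository as a single file without copying the working directory is git bundle. This is a sketch of one option, not the only approach; the paths, file contents, and commit identity are placeholders.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Source repository with one committed file (placeholder content).
git init -q git-repo
echo "hello" > git-repo/foo
git -C git-repo add foo
git -C git-repo -c user.name=Example -c user.email=example@example.com \
    commit -q -m "add foo"

# Pack all refs and their history into one transferable file.
git -C git-repo bundle create "$work/git-repo.bundle" --all

# On the receiving side, cloning from the bundle checks out a new
# working directory and writes a new index that matches it.
mkdir test
git clone -q git-repo.bundle test/git-repo

stale=$(git -C test/git-repo diff-files --name-only)
echo "after clone: '$stale'"
```

Because the clone writes its own index during checkout, the cached stat data matches the files on disk from the start.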

Edward Thomson

While Edward's answer explains the issue, I want to offer a potential work-around.

gitk has a helpful --argscmd flag, whose command is run on every refresh. Assuming a git status doesn't take too long (side note: git gc may speed it up), you can tell gitk to run a git status on every refresh by launching it like so:

gitk --argscmd="git status >/dev/null"

This should hide the issue until gitk implements file status checking that's more in line with git's.

jonny