I am trying to perform a git pull from outside the directory ...
Don't do that. (Why are you doing that?)
git --git-dir=$WORKDIR/sources/.git pull
In a comment, ElpieKay suggests:
Add --work-tree=$WORKDIR/sources
to which you reply:
Seems to be working now
When you use these two options together, Git will:
- relocate itself to your working tree, $WORKDIR/sources;
- use $WORKDIR/sources/.git as the repository proper; and
- run the Git command there.
This pair of options is (aside from overriding any environment $GIT_WORK_TREE and $GIT_DIR variables) the same as doing:
git -C $WORKDIR/sources pull
where -C causes Git to change directories first, before running the git pull. This would be a more typical way to run the git pull from this location: the variant with separate --git-dir and --work-tree options lets you separate the .git from the working tree, in the relatively rare case where that's useful, and overrides any earlier environment variable settings.[1]
[1] Git itself sets these same environment variables when you use these options. So:
git --git-dir=A --work-tree=B whatever
works exactly the same as:
GIT_DIR=A GIT_WORK_TREE=B git whatever
except that the latter form assumes a POSIX-style shell (command line interpreter).
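If you want to convince yourself that the three spellings agree, here is a minimal sketch. It assumes a POSIX-style shell and a throwaway repository created with mktemp (the scratch location is hypothetical and merely stands in for your $WORKDIR). It uses git rev-parse --show-toplevel rather than git pull so that no remote is needed:

```shell
#!/bin/sh
set -e

# Hypothetical scratch location standing in for $WORKDIR.
# pwd -P resolves symlinks so the paths compare cleanly later.
WORKDIR=$(cd "$(mktemp -d)" && pwd -P)
git init -q "$WORKDIR/sources"

# Drop any inherited settings, then step outside the repository.
unset GIT_DIR GIT_WORK_TREE
cd /

# 1. Explicit options on the command line.
top_opts=$(git --git-dir="$WORKDIR/sources/.git" \
               --work-tree="$WORKDIR/sources" rev-parse --show-toplevel)

# 2. The same settings passed as environment variables.
top_env=$(GIT_DIR="$WORKDIR/sources/.git" GIT_WORK_TREE="$WORKDIR/sources" \
          git rev-parse --show-toplevel)

# 3. Change directory first with -C and let Git find .git itself.
top_c=$(git -C "$WORKDIR/sources" rev-parse --show-toplevel)

printf '%s\n' "$top_opts" "$top_env" "$top_c"
```

All three lines of output should name the same directory, $WORKDIR/sources.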
Further reading (optional)
can you provide any insight on why this option is needed?
A Git repository is really just the stuff in the .git directory. A repository consists primarily of two databases:
- One (usually much larger) consists of Git's commits and other internal objects. These are numbered, and retrieved, by their hash IDs. Commit hash IDs in particular are always unique, and to retrieve a commit, Git needs to know its number.
- The other database consists of names, such as branch and tag names, each of which maps to one hash ID number. This allows you to use a branch or tag name to retrieve a commit. Adding new commits usually involves updating a branch name, or a remote-tracking name, so as to be able to find the new commits.
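To make the two databases a bit more concrete, here is a small sketch that pokes at each one. The scratch repository, file name, and commit identity are all made up for illustration:

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)    # hypothetical scratch repository
git init -q "$repo"
echo hello > "$repo/hello.txt"
git -C "$repo" add hello.txt
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    commit -q -m "first commit"

# Names database: the current branch name maps to exactly one hash ID.
branch=$(git -C "$repo" symbolic-ref --short HEAD)
hash=$(git -C "$repo" rev-parse "refs/heads/$branch")

# Objects database: look the object up by that hash ID.
type=$(git -C "$repo" cat-file -t "$hash")

echo "branch $branch -> $hash ($type)"

# List every name in the repository and the hash ID it maps to.
git -C "$repo" for-each-ref
```

The branch name resolves to a hash ID, and that hash ID in turn retrieves a commit object.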
Each commit contains a full snapshot of every file, as a sort of read-only archive. Git compresses these snapshots (in multiple ways) and de-duplicates otherwise-duplicated files within the snapshots, so this takes remarkably little space in many cases. This compression technique is not perfect—it tends to fail in the presence of many large, pre-compressed binary files—but it's very good for human-readable files, such as source code for software.
Commits themselves also contain the commit number—or numbers, plural, for merge commits—of their immediate predecessor commit(s). Thus, as long as Git can find the last commits (via branch or other names), Git can use those to find every earlier commit, one step at a time, through backwards-looking chains. This is the history in the repository: History = commits, starting at the end and working backwards.
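You can see the backwards-looking link directly in a raw commit object. The following is only a sketch, using a throwaway repository with three empty commits and a made-up identity:

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)    # hypothetical scratch repository
git init -q "$repo"

# Helper for this sketch only: make an empty commit with a fixed identity.
make_commit() {
    git -C "$repo" -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}
make_commit "first"
make_commit "second"
make_commit "third"

# The raw commit object records its predecessor on a "parent" line.
parent_line=$(git -C "$repo" cat-file -p HEAD | grep '^parent ')
parent_hash=$(git -C "$repo" rev-parse HEAD~1)
echo "$parent_line"

# Starting at the last commit and walking backwards visits all three.
count=$(git -C "$repo" rev-list --count HEAD)
git -C "$repo" log --format='%h %s'
```

The log output lists the commits newest first, which is exactly the "start at the end and work backwards" traversal.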
But there is a catch. These read-only, compressed commits/files are only usable by Git itself. The main thing you can do with these is exchange them with some other Git. You can also use them with git diff or similar, and do some analysis of how the project has changed over time, and so on. But you can't make any progress. You cannot get new work done with just the repository made up of the two databases.
To get work done, you will need a working tree. Here, you will have Git extract a commit. The extracted commit has the archived (compressed, read-only, Git-ified, useless) files expanded out into their normal everyday (useful) form. We call this a checked out branch or commit. (When and whether we call it a "branch" vs a "commit" is the source of a lot of confusion, because humans are inconsistent here, and sometimes ignorant of the fine details as well. The tricky part is that when we have a branch checked out, we also have a commit checked out. When we have only a commit checked out—via what Git calls a "detached HEAD"—we still have a commit checked out, but not a branch.)
A non-bare repository is defined as a repository that has a working tree. A bare repository is one that lacks a working tree. That's basically all there is to this, but the lack of a working tree means that this repository can receive new commits unconditionally. A Git repository that has a working tree cannot safely receive new commits for the checked out branch. So server (hosting) sites generally store bare repositories.
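If you are not sure which kind of repository you have, Git will tell you. A quick sketch, with hypothetical directory names under a scratch location:

```shell
#!/bin/sh
set -e

base=$(mktemp -d)    # hypothetical scratch location
git init -q "$base/with-tree"           # ordinary repository with a working tree
git init -q --bare "$base/server.git"   # bare repository, as a hosting site would keep

nonbare=$(git -C "$base/with-tree" rev-parse --is-bare-repository)
bare=$(git -C "$base/server.git" rev-parse --is-bare-repository)

echo "with-tree is bare?  $nonbare"
echo "server.git is bare? $bare"
```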
The git pull command means: First, run git fetch, then run a second Git command to do something with new commits obtained by the git fetch step. The fetch step works fine in a bare repository, but the second command (regardless of which one you pick for git pull to run) needs a working tree.
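Here is a sketch of those two steps run by hand, from outside the working tree, using git -C. Everything is set up in a scratch location with made-up names, and the second step is assumed to be a fast-forward merge (one of the things git pull can be configured to run):

```shell
#!/bin/sh
set -e

base=$(cd "$(mktemp -d)" && pwd -P)    # hypothetical scratch location

# Helper for this sketch only: commit_in <repo> <message>
commit_in() {
    git -C "$1" -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$2"
}

git init -q "$base/origin"
commit_in "$base/origin" "first"
git clone -q "$base/origin" "$base/sources"
commit_in "$base/origin" "second"      # new work appears upstream

# Step 1 of "pull": obtain the new commits. No working tree is needed for this.
git -C "$base/sources" fetch -q

# Step 2: do something with them. This is the part that needs the working tree.
git -C "$base/sources" merge -q --ff-only '@{upstream}'

local_head=$(git -C "$base/sources" rev-parse HEAD)
origin_head=$(git -C "$base/origin" rev-parse HEAD)
echo "sources: $local_head"
echo "origin:  $origin_head"
```

After both steps, the clone's HEAD matches the upstream repository's HEAD.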
When you run Git commands, if you have a working tree, you are expected to run them within the working tree. If there is some good reason that you want to run git from outside that working tree, you can use git -C <working tree path>, as shown above. You can also use $GIT_WORK_TREE or the --work-tree argument.
Furthermore, when you do have a working tree and are not using all these complicated methods to separate the repository proper from the working tree, Git expects a .git directory (or file) to exist in the top level. In fact, in the absence of all these fancy separate-the-two-parts tricks, this is how Git finds the top level of the working tree. Let's say you are in:
/path/to/some/working/dir/ectory
Git will look to see if .git exists in this path. If not, Git will take the ectory part off and try again: is there a /path/to/some/working/dir/.git? If not, Git will take the dir part off and try again: is there a /path/to/some/working/.git? If so, Git has found the top level of your working tree, and the .git here determines where the repository itself resides: it is either a file containing the location of the .git directory, or a directory that is itself the .git directory.
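The upward search can be imitated in a few lines of shell. This is only a rough sketch of the idea, not Git's actual code: the find_top function is made up for illustration, it does not check that the .git it finds is a valid repository, and it ignores refinements such as GIT_CEILING_DIRECTORIES. The result is compared against what Git itself reports:

```shell
#!/bin/sh
set -e

# Hypothetical helper: print the nearest ancestor directory that contains .git.
find_top() {
    dir=$(cd "$1" && pwd -P) || return 1
    while :; do
        if [ -e "$dir/.git" ]; then    # .git may be a directory or a file
            printf '%s\n' "$dir"
            return 0
        fi
        [ "$dir" = "/" ] && return 1   # reached the root without finding one
        dir=$(dirname "$dir")          # strip the last path component, try again
    done
}

base=$(cd "$(mktemp -d)" && pwd -P)    # hypothetical scratch location
git init -q "$base/working"
mkdir -p "$base/working/dir/ectory"

found=$(find_top "$base/working/dir/ectory")
actual=$(git -C "$base/working/dir/ectory" rev-parse --show-toplevel)
echo "sketch: $found"
echo "git:    $actual"
```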
In your case, though, you ran:
git --git-dir=$WORKDIR/sources/.git ...
This tells Git: the Git directory is (whatever $WORKDIR expands to)/sources/.git. (This expansion is done by the shell, not by Git.) So Git does not have to search for the top level. You did not tell Git where the top level of the working tree was, so it just assumed that your current directory was the top level of the working tree. But in fact, your current directory was something else. Git may thus have damaged various files on the theory that they were from your working tree.
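You can watch Git make this assumption without risking anything, by using read-only commands in scratch directories. In this sketch the repository, the unrelated directory, and the file name are all hypothetical:

```shell
#!/bin/sh
set -e

repo=$(cd "$(mktemp -d)" && pwd -P)        # hypothetical repository
elsewhere=$(cd "$(mktemp -d)" && pwd -P)   # some unrelated, empty directory

git init -q "$repo"
echo hello > "$repo/hello.txt"
git -C "$repo" add hello.txt
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    commit -q -m "add hello.txt"

unset GIT_DIR GIT_WORK_TREE
cd "$elsewhere"

# With only --git-dir given, Git treats the current directory as the work tree.
assumed_top=$(git --git-dir="$repo/.git" rev-parse --show-toplevel)
echo "assumed top level: $assumed_top"

# The index says hello.txt is checked out, but it is not in this directory,
# so Git reports it as deleted.
porcelain=$(git --git-dir="$repo/.git" status --porcelain)
echo "$porcelain"
```

Commands that write to the working tree, such as the second half of git pull, act on the same mistaken assumption, which is where the potential for damage comes from.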
You may be partially rescued by the fact that Git also stores something Git calls its index (or staging area or cache, depending on which part of Git is doing this "calling"). This is actually just a file inside the repository directory, .git/index for instance. (The exact location of the file can vary and sometimes there are additional files, so don't count too much on this one path. Remember .git/index though if it helps to have a concrete model for what "the index" is.) In this index, Git stores information about which files it has checked out. The presence of the index, plus the assumption that you're already in the top level of your working tree, is why git --git-dir=<path> pull is not behaving correctly.
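If a concrete look at the index helps, this sketch stages one file in a scratch repository and then asks Git what the index records. The repository and file name are made up, and the .git/index path is only the usual default location:

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)    # hypothetical scratch repository
git init -q "$repo"
echo hello > "$repo/hello.txt"
git -C "$repo" add hello.txt

# The index is an ordinary file; in a simple repository it lives here.
ls -l "$repo/.git/index"

# What the index records: mode, blob hash ID, stage number, and path.
git -C "$repo" ls-files --stage

tracked=$(git -C "$repo" ls-files)
echo "tracked: $tracked"
```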