What actually is the working directory in Git?

Question

I have spent a lot of time trying to get a clear idea of what the 'working directory' in Git is. Is it a specific folder or directory, or is it a version of a directory? Can anyone help me understand this concept? What happens if I create a directory 'mydir' locally and then run git init? Thanks.

Does this answer your question? [What's the difference between working directory and local repository?](https://stackoverflow.com/questions/21692155/whats-the-difference-between-working-directory-and-local-repository) — Omar Abdel Bari, Oct 25 '20 at 02:49

torek · Answer 1 · 2020-10-25T02:01:12.797

In Git, the phrase working directory was once a synonym for working tree. It isn't any longer, because the phrase working directory may also be used by your OS (usually with a third word in front, as current working directory). Modern Git tries to use the phrase working tree as much as possible, though this is sometimes shortened to work-tree or worktree, as in git worktree add for instance.

When your OS uses the phrase current working directory, it refers to the folder or directory¹ you are working in at the time. That may be within your working tree.

In Git, the phrase working tree refers to the OS-maintained directories-and-files that hold your copies of files. These are yours, to deal with as you wish: Git simply fills them in from committed files.

What if I create a directory 'mydir' locally then I [run]: git init

Let me rephrase this as the following series of shell commands:

$ mkdir mydir
$ cd mydir
$ git init

The mkdir creates a new, empty directory, within your current working directory. The cd then enters this empty directory, so that now what was ./mydir is your current working directory. The git init command runs with its own current working directory being this empty directory.

Since the directory mydir was empty at the time you ran git init, Git will create a hidden directory / folder named .git within this mydir directory. This hidden directory contains the repository proper. The repository consists of a number of files and directories that implement several databases:

One database is a simple key-value store that uses hash IDs to locate internal Git objects. This is the main (and usually largest) of the two primary databases that make up a Git repository.
One database is another simple key-value store that uses names as keys, to store hash IDs, which are then used in the first database. This is the secondary database that makes up a Git repository. This particular database's implementation in current versions of Git tends to be a bit dodgy: it relies too much on your operating system. On macOS and Windows, it tends to be a bit flawed. There is ongoing work in Git to replace this with a proper database implementation, which will eliminate this problem.
Apart from these two main databases, the repository contains many auxiliary files, including Git's index (aka staging area). The most important point here is that all of these entities live within the .git directory, though.

As there are no commits yet, both main databases are empty. At this point, so is Git's index.
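A quick way to check this for yourself is sketched below. It assumes a throwaway directory made with mktemp, and uses git count-objects, git for-each-ref, and git ls-files to peek at the object database, the names database, and the index respectively:

```shell
# Sketch: inspect a freshly initialized repository (throwaway directory).
tmp=$(mktemp -d) && cd "$tmp" && git init -q

objects=$(git count-objects)    # loose objects in the object database
names=$(git for-each-ref)       # entries in the names-to-hash-IDs database
staged=$(git ls-files --stage)  # entries in Git's index

echo "objects: $objects"
echo "names:   [$names]"
echo "index:   [$staged]"
```

All three should report nothing stored yet.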

Your work-tree consists of all files and directories inside your current working directory except the .git directory, which holds Git's files. Since your work-tree is yours, and is maintained by your OS (not by Git), you can now create any files you like here.
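To see the difference between the OS's current working directory and Git's working tree, here is a small sketch. The subdirectory name src is made up for illustration, and the whole thing runs in a throwaway directory:

```shell
# Sketch: the current working directory can move around inside the working tree.
tmp=$(mktemp -d) && cd "$tmp"
mkdir mydir && cd mydir && git init -q
mkdir src && cd src                    # "src" is a hypothetical subdirectory

cwd=$(pwd)                             # the OS's current working directory
top=$(git rev-parse --show-toplevel)   # top level of Git's working tree
prefix=$(git rev-parse --show-prefix)  # where cwd sits inside the working tree
gitdir=$(git rev-parse --git-dir)      # where the repository proper lives

echo "cwd:     $cwd"
echo "top:     $top"
echo "prefix:  $prefix"
echo "git dir: $gitdir"
```

After the second cd, the current working directory is mydir/src, but the working tree's top level is still mydir, and the repository is still mydir/.git.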

At some point, you will want to have Git create a new commit. This will be the very first commit in the repository. To create this commit, you will add the files you would like to go into this initial commit, into Git's index / staging-area, using git add. The git add program works by copying your work-tree files into Git's index. So, with your OS's current working directory being the mydir directory, you can now just create some file(s):

$ echo "repository for project X" > README
$ git add README
$ git commit

The echo command here creates a new file named README in your working tree. The git add command takes the working tree file, compresses and Git-ifies it to make it ready to be stored in a new commit, and writes the stored file into Git's index.² The final command, git commit, gathers some metadata from you—the person making the commit—and writes out Git's index and this metadata, storing the results in the main database, to create a new commit.
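If you would like to watch these steps happen, the following sketch repeats them in a throwaway directory and inspects the result at each stage. The user name and email are placeholders, supplied only so that the commit works on a machine with no Git identity configured:

```shell
# Sketch: what git add and git commit store (throwaway directory).
tmp=$(mktemp -d) && cd "$tmp" && git init -q

echo "repository for project X" > README
git add README

staged=$(git ls-files --stage)       # mode, blob hash ID, stage number, name
blob=$(git rev-parse :README)        # hash ID of the staged copy of README
kind=$(git cat-file -t "$blob")      # type of that object
content=$(git cat-file -p "$blob")   # its content, as Git stored it

# Placeholder identity, assumed for this example only.
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "initial commit"
head_kind=$(git cat-file -t HEAD)    # type of the object HEAD now resolves to

echo "index entry: $staged"
echo "staged object is a $kind holding: $content"
echo "HEAD resolves to a $head_kind"
```

Note that the blob object already exists right after git add, before any commit has been made.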

Once you've made this new, initial commit—the very first commit in the repository—it becomes possible for branch names to exist. They cannot exist until this point because each branch name must hold a valid, existing hash ID, and hash IDs for future commits are not predictable.³ Now that there is one commit, that's the only hash ID that any branch name can hold.⁴
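You can observe this with a sketch along the following lines, again in a throwaway directory. The file name f and the identity values are placeholders:

```shell
# Sketch: branch names before and after the first commit.
tmp=$(mktemp -d) && cd "$tmp" && git init -q

before=$(git branch --list)          # branch names that exist right now
if git rev-parse --verify -q HEAD >/dev/null; then has_commit=yes; else has_commit=no; fi

echo "x" > f && git add f            # "f" is a hypothetical file
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "initial commit"

after=$(git for-each-ref --format='%(refname:short) %(objectname)' refs/heads)
head=$(git rev-parse HEAD)

echo "before: [$before] (HEAD resolves to a commit: $has_commit)"
echo "after:  $after"
echo "HEAD:   $head"
```

Before the commit, no branch name is listed and HEAD cannot be resolved to a commit. Afterwards there is one branch name, and it holds the hash ID of the one commit.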

Over time, you will add more and more commits to the repository. (In general, it's pretty rare to ever drop a commit, except for, e.g., the way git rebase replaces commits with new-and-improved ones. Dropping a commit is not impossible; it is just difficult.) Each new commit therefore adds to the repository.

The repository itself, then, consists of:

the databases that hold commits and other objects, and the names that find them;
Git's index, used to hold your proposed next commit; and
other maintenance items that you and/or Git may find useful.

The commit objects, and in fact all objects in the big database, are strictly read-only. Nothing and no one can ever change them. They're in a form that is directly useful only to Git itself, though.

Cloning the repository consists of copying the two databases, although the names database is only partly copied, and gets changed during the cloning process.
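Here is a rough sketch of what "partly copied, and changed" looks like in practice. The directory names origin-repo and copy, the branch name feature, and the identity values are all made up for this example:

```shell
# Sketch: compare the names in a repository with the names in its clone.
tmp=$(mktemp -d) && cd "$tmp"
git init -q origin-repo && cd origin-repo
echo "x" > f && git add f
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "initial commit"
git branch feature                   # a second branch name in the original

cd .. && git clone -q origin-repo copy && cd copy
local_names=$(git for-each-ref --format='%(refname)' refs/heads)
remote_names=$(git for-each-ref --format='%(refname)' refs/remotes)

echo "branch names in the clone:"
echo "$local_names"
echo "remote-tracking names in the clone:"
echo "$remote_names"
```

The original has two branch names. The clone ends up with one branch name of its own, while the original's branch names show up under refs/remotes/origin/ instead.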

Meanwhile, your working tree is where you have Git extract commits, turning stuff that's only directly useful to Git—and that is read-only—into stuff you can work with and modify. These are your files. This is how you do your work, in your working tree. You can use the results to update Git's index, and then use Git's index to create a new commit, that adds on to the repository without changing anything that already exists in the repository.

¹At the OS level, the terms folder and directory are synonyms. Git itself does not store folders or directories: it just stores files whose names may contain embedded slashes, such as path/to/file.ext. That's all one single file name. Your OS may force you to first make a folder named path, then in that folder, make a folder named to, and only then use the combined path and to folders to make a file named file.ext within that path. The current working directory can be changed to path, so that you would use the name to/file.ext, instead of path/to/file.ext, or even to path/to so that you would use the name file.ext. In all cases, Git will internally work with a stored file named path/to/file.ext. So your current working directory is an OS concept, referring to how you move around within the folders that your OS maintains.
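As a sketch of this, the commands below stage path/to/file.ext in a throwaway repository and then list the index from two different current working directories:

```shell
# Sketch: one stored name, seen from two current working directories.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
mkdir -p path/to
echo "content" > path/to/file.ext
git add path/to/file.ext

from_top=$(git ls-files)             # listed relative to the top level
cd path/to
from_sub=$(git ls-files)             # listed relative to path/to
full=$(git ls-files --full-name)     # the name as Git stores it

echo "from top level: $from_top"
echo "from path/to:   $from_sub"
echo "stored name:    $full"
```

Only the way the name is displayed changes with the current working directory. The stored name stays path/to/file.ext.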

²Technically, the index doesn't actually hold the files directly. It holds instead a Git blob object hash ID for the file, which provides the key to the key-value object database so that Git can look up the file's content, plus the name of the file—complete with (forward) slashes—and some additional information. The blob object holds a compressed and de-duplicated copy of the file's content.

This de-duplication, and the fact that it is git add that readies the file for committing, means that git commit will go quite fast, as it need not prepare anything for committing: it just saves, permanently, the blob objects already stored in the index.
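The de-duplication is easy to see in a small sketch. The file names a.txt and b.txt are made up, and the repository is a throwaway one:

```shell
# Sketch: two files with identical content share one blob object.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
echo "same content" > a.txt
echo "same content" > b.txt
git add a.txt b.txt

blob_a=$(git rev-parse :a.txt)       # blob hash ID in the index entry for a.txt
blob_b=$(git rev-parse :b.txt)       # blob hash ID in the index entry for b.txt
expected=$(echo "same content" | git hash-object --stdin)  # computed, not stored

echo "a.txt -> $blob_a"
echo "b.txt -> $blob_b"
echo "hash-object says: $expected"
```

Both index entries refer to the same hash ID, so the content is stored only once.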

³The hash ID of a commit is a cryptographic checksum of the commit's complete content. The content includes not only the saved source files (as an internal Git tree object), but also the exact date-and-time-stamp. Since we don't even know what you'll commit in the future, much less exactly when you will commit it, we cannot compute what the future hash ID will be. You may know what you will commit, which gets you closer; but unless you know exactly when you will commit it, you won't know the hash ID either.
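To illustrate the dependence on the time stamp, the sketch below builds commit objects by hand with the plumbing commands git write-tree and git commit-tree, pinning the dates through the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables. The identity values and the two time stamps are arbitrary, and make_commit is a helper defined only for this example:

```shell
# Sketch: same content and metadata, different time stamp, different hash ID.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

echo "repository for project X" > README
git add README
tree=$(git write-tree)               # tree object built from the index

# Hypothetical helper: create a commit object with a fixed time stamp ($1).
make_commit() {
  GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
    git commit-tree "$tree" -m "initial commit"
}

c1=$(make_commit "1603591200 +0000")
c2=$(make_commit "1603591200 +0000") # everything identical to c1
c3=$(make_commit "1603591201 +0000") # one second later

echo "c1: $c1"
echo "c2: $c2"
echo "c3: $c3"
```

c1 and c2 come out identical because every input is identical. Moving the time stamp by one second gives c3 a different hash ID.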

⁴Branch names in particular are constrained: they may only hold a commit hash ID. Tag names can hold the hash ID of any of Git's four internal object types. (Usually, though, a tag name either holds a commit hash ID, or the hash ID of a newly-created annotated tag object, which in turn holds a commit hash ID.) Other types of names may have their own constraints.
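A short sketch of that difference, using a throwaway repository. The names blob-tag and blob-branch are made up for this example:

```shell
# Sketch: a tag name may hold a blob's hash ID; a branch name may not.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
blob=$(echo "not a commit" | git hash-object -w --stdin)  # store a blob object

git tag blob-tag "$blob"                 # lightweight tag pointing at the blob
tag_kind=$(git cat-file -t blob-tag)     # type of the object the tag names

# Git should refuse to create a branch name that holds a non-commit hash ID.
if git branch blob-branch "$blob" 2>/dev/null; then branch_ok=yes; else branch_ok=no; fi

echo "blob-tag names a $tag_kind"
echo "branch creation succeeded: $branch_ok"
```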

Thanks for your help torek. It is clear for me now. — perci, Nov 03 '20 at 01:39
