
I've been trying to figure out how to get a list of all files in a git repo, including those contained within submodules. Currently, git ls-files lists the top-level submodule directory but not the files inside the submodule. On further investigation, I found that with git submodule you can recursively visit all of the submodules and run git ls-files in each one:

git submodule --quiet foreach --recursive "git ls-files"

The only problem is that the resulting paths are relative to each submodule, but I need paths relative to the top-level repo. For example, given:

/some/path/to/gitrepo/source/submodule/[file1, file2]

What I see is:

file1
file2

What I would like to see is:

source/submodule/file1
source/submodule/file2

Is there a way to do this? From the documentation, there are some pre-defined variables ($name, $path, $sha1 and $toplevel) but I'm not sure how to use these to get the desired results.

Sheldon
  • Note: you will have `git ls-files --recurse-submodules` with Git 2.11+ (Q4 2016). See [my answer below](http://stackoverflow.com/a/40311391/6309). It can be run from the main parent repo and will produce the full path. – VonC Oct 28 '16 at 18:46

2 Answers


Another approach is possible with Git 2.11+ (Q4 2016)

git ls-files --recurse-submodules

See commit 75a6315, commit 07c01b9, commit e77aa33, commit 74866d7 (07 Oct 2016) by Brandon Williams (mbrandonw).
(Merged by Junio C Hamano -- gitster -- in commit 1c2b1f7, 26 Oct 2016)

ls-files: optionally recurse into submodules

"git ls-files" learned "--recurse-submodules" option that can be used to get a listing of tracked files across submodules (i.e. this only works with "--cached" option, not for listing untracked or ignored files).

This would be a useful tool to sit on the upstream side of a pipe that is read with xargs to work on all working tree files from the top-level superproject.

As shown in this test, the output would include the full path of the file, starting from the main parent repo.

The git ls-files documentation now includes:

--recurse-submodules

Recursively calls ls-files on each submodule in the repository.
Currently there is only support for the --cached mode.
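To see the full-path behaviour described above, here is a minimal sketch that builds a throwaway superproject and lists its files. The scratch directory, the `g` wrapper, the identity settings and the file names are all illustrative assumptions, not part of the answer; `protocol.file.allow=always` is only there because recent Git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
# Sketch: create a superproject with one submodule, then list every tracked
# file relative to the superproject root (needs Git 2.11+).
set -e
tmp=$(mktemp -d)   # hypothetical scratch directory

# Hypothetical wrapper: fixed identity, and local submodule clones allowed.
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# Repository that will be mounted as the submodule.
g init -q "$tmp/lib"
echo a > "$tmp/lib/file1"
echo b > "$tmp/lib/file2"
g -C "$tmp/lib" add file1 file2
g -C "$tmp/lib" commit -q -m "lib files"

# Superproject with the submodule at source/submodule.
g init -q "$tmp/super"
echo top > "$tmp/super/README"
g -C "$tmp/super" add README
g -C "$tmp/super" submodule --quiet add "$tmp/lib" source/submodule
g -C "$tmp/super" commit -q -m "add submodule"

# Paths are reported from the superproject root, submodule files included.
files=$(g -C "$tmp/super" ls-files --recurse-submodules)
printf '%s\n' "$files"

rm -rf "$tmp"
```

For the xargs use case mentioned in the commit message, adding `-z` and reading with `xargs -0` should keep unusual file names intact, though that combination is not exercised in this sketch.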


Git 2.13 (Q2 2017) adds to the ls-files --recurse-submodules robustness:

See commit 2cfe66a, commit 2e5d650 (13 Apr 2017) by Jacob Keller (jacob-keller).
(Merged by Junio C Hamano -- gitster -- in commit 2d646e3, 24 Apr 2017)

ls-files: fix recurse-submodules with nested submodules

Since commit e77aa33 ("ls-files: optionally recurse into submodules", 2016-10-07, git 2.11) ls-files has known how to recurse into submodules when displaying files.

Unfortunately this fails for certain cases, including when nesting more than one submodule, called from within a submodule that itself has submodules, or when the GIT_DIR environment variable is set.

Prior to commit b58a68c ("setup: allow for prefix to be passed to git commands", 2017-03-17, git 2.13-rc0) this resulted in an error indicating that --prefix and --super-prefix were incompatible.

After this commit, instead, the process loops forever with a GIT_DIR set to the parent, continuously reading the parent submodule files and recursing forever.

Fix this by preparing the environment properly for submodules when setting up the child process. This is similar to how other commands such as grep behave.


As noted with Git 2.29 (Q4 2020), git ls-files does not honor the submodule.recurse config, so the --recurse-submodules option must be passed explicitly.

See commit 7d15fdb (04 Oct 2020) by Philippe Blain (phil-blain).
(Merged by Junio C Hamano -- gitster -- in commit 9d19e17, 05 Oct 2020)

gitsubmodules doc: invoke 'ls-files' with '--recurse-submodules'

Signed-off-by: Philippe Blain

git ls-files(man) was never taught to respect the submodule.recurse configuration variable, and it is too late now to change that, but still the command is mentioned in 'gitsubmodules(7)' as if it does respect that config.

Adjust the call in 'gitsubmodules(7)' by calling 'ls-files' with the '--recurse-submodules' option.

gitsubmodules now includes in its man page:

git ls-files --recurse-submodules

[NOTE]
git ls-files also requires its own --recurse-submodules flag.


With Git 2.36 (Q2 2022), git ls-files --stage --recurse-submodules is also supported.


With Git 2.40 (Q1 2023), stop using git --super-prefix and narrow the scope of its use to the submodule--helper.

See commit 4002ec3, commit f5a6be9, commit 04f1fab, commit 99a32d8, commit 677c981, commit bb61a96, commit f0a5e5a, commit 49eb1d3 (20 Dec 2022) by Ævar Arnfjörð Bjarmason (avar).
See commit 0d1806e (20 Dec 2022) by Glen Choo (chooglen).
(Merged by Junio C Hamano -- gitster -- in commit d4c5400, 05 Jan 2023)

read-tree: add "--super-prefix" option, eliminate global

Signed-off-by: Ævar Arnfjörð Bjarmason

The "--super-prefix" option to "git" was initially added in commit 74866d7 ("git: make super-prefix option", 2016-10-07, Git v2.11.0-rc0 -- merge listed in batch #11) for:

  • use with "ls-files" (commit e77aa33 ("ls-files: optionally recurse into submodules", 2016-10-07, Git v2.11.0-rc0 -- merge listed in batch #11)), and shortly thereafter
  • "submodule--helper" (commit 89c8626 ("submodule helper: support super prefix", 2016-12-08, Git v2.12.0-rc0 -- merge listed in batch #5)) and
  • "grep" (0281e48 ("grep: optionally recurse into submodules", 2016-12-16, Git v2.12.0-rc0 -- merge listed in batch #6)).

It wasn't until commit 3d41542 ("unpack-trees: support super-prefix option", 2017-01-17, Git v2.12.0-rc0 -- merge) that "read-tree" made use of it.

At the time it made sense, but since then we've made "ls-files" recurse in-process in commit 188dce1 ("ls-files: use repository object", 2017-06-22, Git v2.14.0-rc0 -- merge listed in batch #14), "grep" in commit f9ee2fc ("grep: recurse in-process using 'struct repository'", 2017-08-02, Git v2.15.0-rc0 -- merge listed in batch #2), and finally "submodule--helper" in the preceding commits.

Let's also remove it from "read-tree", which allows us to remove the option to "git" itself.

We can do this because the only remaining user of it is the submodule API, which will now invoke "read-tree" with its new "--super-prefix" option.
It will only do so when the "submodule_move_head()" function is called.

That "submodule_move_head()" function was then only invoked by "read-tree" itself, but now rather than setting an environment variable to pass "--super-prefix" between cmd_read_tree() we:

  • Set a new "super_prefix" in "struct unpack_trees_options".

git now includes in its man page:

[--config-env=<name>=<envvar>] <command> [<args>]

VonC

Take a look at the git submodule documentation, which says:

foreach

Evaluates an arbitrary shell command in each checked out submodule. The command has access to the variables $name, $path, $sha1 and $toplevel: $name is the name of the relevant submodule section in .gitmodules, $path is the name of the submodule directory relative to the superproject, $sha1 is the commit as recorded in the superproject, and $toplevel is the absolute path to the top-level of the superproject.

Given the above information, you can do something like:

git submodule --quiet foreach 'git ls-files | sed "s|^|$path/|"'

In this example, we're simply taking the output from git ls-files in a submodule and using sed to prepend the value of $path, which is the path of the submodule relative to the parent project's toplevel directory.
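If the submodule path might contain the sed delimiter, one alternative is to skip sed and prepend the path with printf. This is only a sketch: the function name `prefixed_submodule_files` is made up for illustration, and it assumes `$path` is set by foreach as the quoted documentation describes.

```shell
#!/bin/sh
# Sketch: print each submodule's tracked files prefixed with the submodule
# path, without sed, so no delimiter character in $path can break it.
# prefixed_submodule_files is a hypothetical helper name.
prefixed_submodule_files() {
    # $path is expanded by the shell that foreach spawns inside each
    # submodule, which is why the whole command is single-quoted here.
    git submodule --quiet foreach '
        git ls-files | while IFS= read -r f; do
            printf "%s/%s\n" "$path" "$f"
        done
    '
}
```

Call it from the top level of the superproject. It only covers direct submodules; with --recursive, the prefix for a nested submodule would likely be relative to its immediate parent rather than to the top-level repo.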

larsks
  • This is technically wrong if $path contains '|' (or whatever other character you use for separating sed). You can make it extra robust like this: `git submodule --quiet foreach 'export path;bash -c '\''git ls-files | sed "s/^/${path/\//\\/}\//"'\'` – cdleonard Dec 21 '17 at 13:21
  • Despite the problematic sed trick I like this answer. pointing me to 'git submodule foreach'. I did [ git submodule foreach 'git ls-files --others --exclude-standard' ] to list files that are ready to be added. – grenix May 18 '21 at 13:25