I am writing some automation to update a git submodule to the latest commit on the remote. Normally this is achieved by specifying branch=<blah> in the .gitmodules file and running:
git submodule update --remote
But for me this is not suitable.
The main repository I am dealing with is over 6 GB in size, and the submodule is over 10 GB. Do not ask me why; this is the way it is. When one runs git submodule update --remote --init, git performs a full checkout of the remote branch, which takes forever. I want to update the submodule pointer in the main repository to the latest commit without checking out the entire submodule. I also want to avoid checking out the entire main repository if possible. So I devised a solution using sparse checkouts and manually cloning the submodule:
1) Perform a sparse checkout of the main repository with just .gitmodules
git clone -b <branch> --no-checkout --depth=1 <url>
cd <main-repo-directory>
git config core.sparsecheckout true
echo .gitmodules > .git\info\sparse-checkout
git checkout <topic-branch>
2) Clone the submodule manually with
git submodule init
git clone --no-checkout --depth=1 -b <submodule-branch> <submodule-url> <submodule-path>
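For automation, the two steps above could be collected into a single shell function. This is only a sketch under some assumptions: the names prepare_submodule_bump, run, and DRY_RUN are hypothetical, the topic branch is assumed to be the same branch that was cloned, git -C is used in place of changing directory, and paths use forward slashes as in a POSIX shell rather than the Windows path shown above.

```shell
#!/bin/sh
# Sketch only: every argument is a placeholder supplied by the caller.
# Set DRY_RUN=1 to print the commands instead of executing them.

# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: prepare_submodule_bump <main-url> <branch> <workdir> \
#            <submodule-url> <submodule-branch> <submodule-path>
prepare_submodule_bump() {
    main_url=$1; branch=$2; workdir=$3
    sub_url=$4; sub_branch=$5; sub_path=$6

    # Step 1: shallow clone of the main repository with an empty working
    # tree, then a sparse checkout that materializes only .gitmodules.
    run git clone -b "$branch" --no-checkout --depth=1 "$main_url" "$workdir" || return 1
    run git -C "$workdir" config core.sparseCheckout true || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "echo .gitmodules > $workdir/.git/info/sparse-checkout"
    else
        mkdir -p "$workdir/.git/info" || return 1
        echo .gitmodules > "$workdir/.git/info/sparse-checkout" || return 1
    fi
    # Assumption: the topic branch is the branch that was just cloned.
    run git -C "$workdir" checkout "$branch" || return 1

    # Step 2: register the submodule, then shallow-clone it by hand
    # without populating its working tree.
    run git -C "$workdir" submodule init || return 1
    run git clone --no-checkout --depth=1 -b "$sub_branch" "$sub_url" "$workdir/$sub_path"
}
```

Calling it with DRY_RUN=1 first prints the six commands it would run, which makes it easy to compare against the manual steps before touching a multi-gigabyte repository.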
At this point I would expect git status to report a difference in the submodule commit, since it is now different from the checked-in version of the submodule. Then I could simply add the submodule path, commit, and create a pull request. However, with this sparse checkout method, the main repository is unaware of the change I made in the submodule. The method works if I do an ordinary checkout of the main repository instead of a sparse one, but then I need to check out the entire main repository as well. Is this some limitation of sparse checkout in relation to git submodules?
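One way to check the commit difference independently of git status might be to compare the commit recorded in the main repository's HEAD tree with the HEAD of the manual clone. The sketch below assumes a POSIX shell with awk available; the function names gitlink_commit and submodule_pointer_differs are hypothetical, and both arguments are placeholders.

```shell
#!/bin/sh
# Sketch only: paths are placeholders. Compares the commit the main
# repository records for the submodule with the commit of the manual clone.

# Extract the object ID from `git ls-tree` output read on stdin. Each line
# has the form "<mode> <type> <object><TAB><path>"; a submodule entry
# (gitlink) has mode 160000 and type "commit".
gitlink_commit() {
    awk '$1 == "160000" && $2 == "commit" { print $3 }'
}

# usage: submodule_pointer_differs <main-workdir> <submodule-path>
# Returns 0 if the two commits differ, 1 if they match, 2 on error.
submodule_pointer_differs() {
    recorded=$(git -C "$1" ls-tree HEAD "$2" | gitlink_commit)
    cloned=$(git -C "$1/$2" rev-parse HEAD) || return 2
    [ -n "$recorded" ] || return 2
    echo "recorded: $recorded"
    echo "cloned:   $cloned"
    [ "$recorded" != "$cloned" ]
}
```

Differing IDs would at least confirm that the manual clone sits on a different commit than the one recorded in the main repository, even while git status reports nothing.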
Thanks!