
I'm rather new to submodules and git, but I have been using them to include all of my plugins for tmux and vim, etc on my own dotfiles repository on github.

It doesn't happen that often, but sometimes when I do a pull on my dotfiles repository, many of my submodule files have changed. For example, during my most recent git fetch, I get something like this (removed a couple plugin updates to make it shorter):

remote: Enumerating objects: 32, done.
remote: Counting objects: 100% (32/32), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 21 (delta 11), reused 20 (delta 10), pack-reused 0
Unpacking objects: 100% (21/21), 2.65 KiB | 14.00 KiB/s, done.
From https://github.com/someone/dotfiles
   fb997fe..2bffa27  master     -> origin/master
Fetching submodule tmux/plugins/tmux-yank
From https://github.com/tmux-plugins/tmux-yank
   d776f4e..1b1a436  master     -> origin/master
Fetching submodule vim/bundle/ultisnips
From https://github.com/SirVer/ultisnips
   7941f98..d3b36cd  master     -> origin/master
Fetching submodule vim/bundle/vimtex
From https://github.com/lervag/vimtex
   9b53bb31..49eab5d5  master     -> origin/master
 * [new tag]           v1.4       -> v1.4

During merge:

herophant:~/.dotfiles$ git merge
Updating fb997fe..2bffa27
Fast-forward
 .gitmodules                   |  9 +++++++++
 bashrc                        |  6 ++++++
 tmux/plugins/tmux-yank        |  2 +-
 vim/UltiSnips/tex.snippets    |  8 ++++++++
 vim/bundle/rainbow            |  1 +
 vim/bundle/syntastic          |  1 +
 vim/bundle/ultisnips          |  2 +-
 vim/bundle/vim-airline        |  2 +-
 vim/bundle/vim-airline-themes |  2 +-
 vim/bundle/vim-fugitive       |  2 +-
 vim/bundle/vim-racket         |  1 +
 vim/bundle/vimtex             |  2 +-
 vim/ftplugin/tex.vim          |  1 +
 vimrc                         | 20 ++++++++++++++++++--
 14 files changed, 51 insertions(+), 8 deletions(-)
 create mode 160000 vim/bundle/rainbow
 create mode 160000 vim/bundle/syntastic
 create mode 160000 vim/bundle/vim-racket
 create mode 100644 vim/ftplugin/tex.vim

Now my git status (short form) says:

## master...origin/master
 M .gitmodules
 M tmux/plugins/tmux-yank
 M vim/bundle/ultisnips
 M vim/bundle/vim-airline
 M vim/bundle/vim-airline-themes
 M vim/bundle/vim-fugitive
 M vim/bundle/vimtex

I don't really care what's happening to the submodules, but if they're getting updated, that's probably a good thing. I just want them to do this silently so that I don't have to make another commit to account for this. Is there a way to achieve this?

Edit: Even if I commit this change to the raw hash, committing a raw commit hash on one machine results in other machines "modifying" those commit hashes back to their previous versions after a git pull.

As an example, vim-airline-themes was modified yesterday. I'm not sure what happened, but right after doing a git commit, vim-airline-themes was shown as modified. OK, I'll do another commit. This is what changed:

me@main-machine:~/.dotfiles$ git diff 4577802~ 4577802
diff --git a/vim/bundle/vim-airline-themes b/vim/bundle/vim-airline-themes
index e1b0d9f..7f53ebc 160000
--- a/vim/bundle/vim-airline-themes
+++ b/vim/bundle/vim-airline-themes
@@ -1 +1 @@
-Subproject commit e1b0d9f86cf89e84b15c459683fd72730e51a054
+Subproject commit 7f53ebc8f7af2fd7e6a0a31106b99491e01cd18f

I commit, do some more git pulls just to be sure, and "no new changes". I go to my other machine, do a git pull from my dotfiles repo, and see that the /vim/bundle/vim-airline-themes submodule has been modified. What was changed?

me@other-machine:~/.dotfiles$ git diff vim/bundle/vim-airline-themes
diff --git a/vim/bundle/vim-airline-themes b/vim/bundle/vim-airline-themes
index 7f53ebc..e1b0d9f 160000
--- a/vim/bundle/vim-airline-themes
+++ b/vim/bundle/vim-airline-themes
@@ -1 +1 @@
-Subproject commit 7f53ebc8f7af2fd7e6a0a31106b99491e01cd18f
+Subproject commit e1b0d9f86cf89e84b15c459683fd72730e51a054

Git seems to want to undo its own changes for some reason? This all seems very redundant, and I'm sure it can be prevented in some way. What is the solution?

herophant

1 Answer


The short answer is no. More specifically:

I just want them to do this silently so that I don't have to make another commit to account for this.

You really do have to make a new commit! A submodule is another Git repository, and as such, it's implemented in two parts:

  • First (and in many ways much less important), you'll have a file named .gitmodules in the top level of the superproject's work-tree. This file goes into each commit, and it stores the information a new superproject-clone needs in order to run its own git clone of each submodule.

  • Second—and the reason you need to make a new commit—each commit stores a data-pair, consisting of:

    • the path of the submodule, and
    • the raw commit hash ID to be used in the submodule.

The superproject Git uses those hash IDs to know what to git checkout in each submodule. A submodule repository is controlled by its superproject, by the superproject Git doing a:

(cd $path && git checkout $hash)

to get a detached-HEAD in the submodule at the specified commit. The $path and $hash come from the superproject commit.

The fact that each of the various submodules got updated is not the trigger here. The fact that you (presumably) want to use the latest commit in the submodule is the trigger: you need to record this fact in your superproject. To do so requires making a new commit in the superproject Git.
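To make the path-and-hash pairing concrete, here is a minimal sketch that builds a throwaway superproject and inspects what its commit actually records. The repository names (plugin, dotfiles) and the path vim/bundle/plugin are hypothetical stand-ins, and the GIT_ALLOW_PROTOCOL setting is only there because the demo clones from a local directory rather than from a real remote.

```shell
#!/bin/sh
# Sketch: a superproject commit records a submodule as (path, raw commit hash).
# "plugin" and "dotfiles" are hypothetical names used only for this demo.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_ALLOW_PROTOCOL=file   # needed only because the demo clones from local paths

work=$(mktemp -d)
cd "$work"

# A stand-in for an upstream plugin repository.
git -c init.defaultBranch=master init -q plugin
git -C plugin commit -q --allow-empty -m "plugin: first commit"

# The superproject, with the plugin added as a submodule.
git -c init.defaultBranch=master init -q dotfiles
cd dotfiles
git submodule --quiet add "$work/plugin" vim/bundle/plugin
git commit -q -m "add plugin submodule"

# The commit holds a "gitlink" entry: mode 160000, type commit, hash, path.
git ls-tree HEAD vim/bundle/plugin
recorded=$(git rev-parse HEAD:vim/bundle/plugin)
checked_out=$(git -C vim/bundle/plugin rev-parse HEAD)

# Move the submodule to a newer commit: the superproject now sees a change.
git -C vim/bundle/plugin commit -q --allow-empty -m "plugin: second commit"
status_before=$(git status --short)

# Recording the new hash takes an ordinary add + commit in the superproject.
git add vim/bundle/plugin
git commit -q -m "bump plugin"
status_after=$(git status --short)
new_recorded=$(git rev-parse HEAD:vim/bundle/plugin)
```

Here status_before holds the same kind of " M" line the question shows, and it only goes away once the superproject commits the new hash.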


Edit: as noted in comments below, it can become pretty tricky to decide exactly which commit(s) you want from which submodule repository/ies. I personally dislike git pull but it does have a feature that can be useful here: you can have it recursively enter each submodule and run another git pull in each of those. That's a multi-edged sword though, as you might not want a pull per se.

The git submodule command, which you run in the superproject, has a number of operating modes as well. git submodule update --remote has your superproject run a git fetch in each submodule followed by a git checkout in each submodule. Compare this with git submodule update without --remote, which will also run a git fetch in each submodule (when the commit it needs is not already present there) followed by a git checkout in each submodule. This brings up an obvious (but good) question: what, precisely, is the difference, given that we just said both do much the same thing? The answer lies in which hash ID the git checkout uses. The one without --remote uses the hash ID that the superproject commit records. The one with --remote uses the hash ID that goes with some remote-tracking name in the submodule in question, as updated by the git fetch run in that submodule.
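The difference between the two modes can be seen in a small, self-contained sketch. As before, the repository names and the submodule path are hypothetical, the upstream is a local directory standing in for a real remote, and GIT_ALLOW_PROTOCOL is set only so that local-path clones and fetches are permitted in the demo.

```shell
#!/bin/sh
# Sketch: `git submodule update` vs `git submodule update --remote`.
# "plugin" and "dotfiles" are hypothetical names used only for this demo.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_ALLOW_PROTOCOL=file   # needed only because the demo uses local paths

work=$(mktemp -d)
cd "$work"

git -c init.defaultBranch=master init -q plugin
git -C plugin commit -q --allow-empty -m "plugin: v1"

git -c init.defaultBranch=master init -q dotfiles
cd dotfiles
git submodule --quiet add "$work/plugin" vim/bundle/plugin
git commit -q -m "add plugin at v1"
pinned=$(git rev-parse HEAD:vim/bundle/plugin)

# Upstream moves on after the superproject pinned v1.
git -C "$work/plugin" commit -q --allow-empty -m "plugin: v2"
upstream=$(git -C "$work/plugin" rev-parse HEAD)

# Plain update: check out the hash the superproject commit asks for.
git submodule --quiet update
after_plain=$(git -C vim/bundle/plugin rev-parse HEAD)

# --remote: fetch, then check out the tip of the submodule's
# remote-tracking branch instead of the recorded hash.
git submodule --quiet update --remote
after_remote=$(git -C vim/bundle/plugin rev-parse HEAD)

# The superproject still records the old hash until it is added and committed.
still_pinned=$(git rev-parse HEAD:vim/bundle/plugin)
git add vim/bundle/plugin
git commit -q -m "bump plugin to upstream tip"
```

After the plain update the submodule is still at the pinned commit. After --remote it sits at the upstream tip, and the superproject shows it as modified until the final add and commit.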

With so many repositories involved—there's one for the superproject, one for the remote of the superproject, one for each submodule, and one remote for each submodule, all of which are being used here—it gets confusing. Add on to that that each submodule can be a superproject for one or more submodules of its own, and each of these operations can be recursive, or not, and it's quite a nightmare.

This sort of thing is one of the lingering reasons that submodules are sometimes called "sob-modules". Many of the most painful aspects of submodules are ... well, I won't say fixed now, but less painful than they were ten years ago. But Git is already inherently tricky due to being distributed, and submodules just make this exponentially worse.

There's no single cure for all of this: it really is a hard problem, and you just have to put a lot of skull-sweat into it. If the authors of your various submodules coordinate their work, that can help.

torek
  • Thank you for the answer @torek. I think I am understanding all of this a bit better, but this seems to be happening on all my machines, and sometimes git seems to want to undo its own changes on another machine (with respect to submodules that it is in charge of)? Could you explain why this would be happening? I've edited my question. – herophant Oct 09 '20 at 12:10
  • Things can get pretty tricky here because there are so many Git repositories involved. Remember that the superproject *only* has path-and-hash-ID for any given submodule, and that `git pull` means *run `git fetch`, then run a second Git command* and the *second* Git command may or may not take updates to submodules. You can also set pull to be recursive, or not: if set to recursive, it does a `git fetch` and second Git command in each submodule as well, so now we've got a lot of moving parts! – torek Oct 09 '20 at 12:52
  • So: suppose some superproject is updated and the new commit you've picked up from the other superproject repository has a new commit hash ID for some old path. Suppose further that you *don't* have recursive pull going on, so that your actual submodules (in your own clones) *aren't* updated. But the superproject is now calling for hash IDs that you may not even have in the submodules. You can use `git submodule update` to have the superproject run `git fetch` and then `git checkout` in each submodule. This requires a lot of reading of the `git submodule` documentation, as it's ... complicated. – torek Oct 09 '20 at 12:55
  • Or, perhaps your superproject *doesn't* call for new commits in the submodules. If you would like it to do so, you'll again want to run `git submodule update --remote` and this time follow that up with superproject-side `git add` operations. – torek Oct 09 '20 at 12:57
  • For the peculiar case that I edited into my question, it just seemed to go back and forth, i.e., committing on my other machine would lead to "changes" when I `pull`ed on my main machine in the superproject. Committing on my main machine led to changes after I pulled on my other machine, which is just a never-ending loop. I just gave up and did `rm -rf vim-airline-themes` and did a `git checkout vim-airline-themes` haha. Perhaps sob-modules are a bit troublesome, but it's probably better than not having them. I don't think mine are recursive though; that would be a nightmare. – herophant Oct 11 '20 at 14:43
  • @herophant: it could be worse (than just simple recursion), if you have a submodule that winds up referring back to itself. That's like ClearCase's "cspecs" with include directives, which is an even-worse nightmare. – torek Oct 11 '20 at 15:38
  • Is there instead a way to "only" update changes to the superproject and leave alone changes or commits to the submodules, so I can just manually update some of them once every 2 or 3 months? I make changes to my dotfiles quite often, and would like to do this "asynchronously". Can I modify the behaviour of `git pull` not to go into my submodules? Or any other command that achieves the same effect? – herophant Oct 11 '20 at 18:33
  • Well, `git pull` = `git fetch` + command#2. Command#2 is your choice: rebase or merge. It's just that, as a convenience command, pull tries to be convenient and also recurse into submodules (depending on settings). So use the "less convenient" separate commands, which also never automatically recurse. It's odd how these "less convenient" commands are actually *more* convenient... :-) – torek Oct 11 '20 at 19:25
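
The back-and-forth described in the question's edit and in the comments above can be reproduced with two clones of one superproject. This is only a sketch under assumptions: the directories main and other stand in for the two machines, plugin is a hypothetical upstream, and GIT_ALLOW_PROTOCOL is set solely because everything lives on local paths. It suggests that the "undo" diff on the second machine is a stale submodule checkout, which `git submodule update` brings in line without any new commit.

```shell
#!/bin/sh
# Sketch: why a second clone shows a "reversed" submodule diff after a pull.
# "plugin", "main" and "other" are hypothetical names used only for this demo.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_ALLOW_PROTOCOL=file   # needed only because the demo uses local paths

work=$(mktemp -d)
cd "$work"

git -c init.defaultBranch=master init -q plugin
git -C plugin commit -q --allow-empty -m "plugin: v1"

# "Main machine": superproject with the plugin pinned at v1.
git -c init.defaultBranch=master init -q main
cd "$work/main"
git submodule --quiet add "$work/plugin" vim/bundle/plugin
git commit -q -m "add plugin at v1"

# "Other machine": a second clone of the superproject, submodules included.
cd "$work"
git clone -q --recurse-submodules "$work/main" other

# Main machine bumps the plugin to v2 and commits the new hash.
git -C "$work/plugin" commit -q --allow-empty -m "plugin: v2"
cd "$work/main"
git submodule --quiet update --remote
git add vim/bundle/plugin
git commit -q -m "bump plugin to v2"

# Other machine pulls the superproject only; its submodule checkout stays at v1,
# so the superproject (which now asks for v2) reports the submodule as modified.
cd "$work/other"
git pull -q --ff-only --no-recurse-submodules
status_after_pull=$(git status --short)

# Committing now would record v1 again and restart the loop.
# Syncing the checkout to the recorded hash clears the status instead.
git submodule --quiet update
status_after_update=$(git status --short)
```

Running `git submodule update` after each pull of the superproject, rather than committing the stale state, is one way to avoid the loop.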