
We use git to store our PC board designs. We are a small company and the only person doing commits & pushes is the board designer (me), so there's no risk of overwriting someone else's work by rewriting git history.

We recently had to change our OS from Linux (Ubuntu) to Windows. Several files in our repos violate Windows filename rules (e.g. contain a colon, path too long, etc.). I sometimes need to check out previous commits to see the design history, so ideally the entire history needs to contain valid Windows filenames, otherwise the checkout fails.
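Before rewriting anything, it can help to list which paths actually break Windows rules. The following is a minimal sketch, not an exhaustive validator: the function name `windows_problems`, the 260-character limit (the classic MAX_PATH, which long-path settings can raise) and the `prefix_len` parameter are assumptions made for illustration.

```python
import re

# Characters Windows forbids inside a single file or directory name.
INVALID_CHARS = re.compile(r'[<>:"\\|?*]')
# Classic MAX_PATH limit; an assumption, since long-path support can raise it.
MAX_PATH = 260
# Device names Windows reserves, with or without an extension.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def windows_problems(path, prefix_len=0):
    """Return a list of reasons why a repo-relative path would fail on Windows.

    prefix_len is the length of the checkout directory that precedes the path.
    """
    problems = []
    for part in path.split("/"):
        if INVALID_CHARS.search(part):
            problems.append(f"invalid character in {part!r}")
        if part.split(".")[0].upper() in RESERVED:
            problems.append(f"reserved name {part!r}")
        if part.endswith((" ", ".")):
            problems.append(f"trailing space or dot in {part!r}")
    if prefix_len + len(path) >= MAX_PATH:
        problems.append("path too long")
    return problems
```

The paths to check could come from something like `git log --all --name-only --pretty=format:`, which prints every path touched anywhere in the history, one per line.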

Doing an interactive git rebase seemed like it could solve the problem. Unfortunately git doesn't handle things the way I was hoping.

For example, I have a file called file:1.txt that was renamed to file:2.txt several commits later. Because this file wasn't critical, I decided to delete it using a git interactive rebase. I deleted it from the first commit where it appeared. I thought git would know that it had been deleted and update the subsequent commits accordingly. Unfortunately that's not what happened. The rebase got to the subsequent commit where the file was renamed and reported a conflict, because it was trying to rename a file that no longer existed.

A few other similar issues popped up in my attempts with other files/paths, but the basic problem seemed to be that a git interactive rebase doesn't always handle filename changes or file deletions well. It doesn't seem to make all the necessary adjustments to the git history automatically.

Am I missing something here? Is there a better way to do this?

mkrieger1
svk
  • It should have asked you whether you wanted to "keep" the deleted file or reintroduce the file with the new name. You could have chosen "deleted". – mkrieger1 Jul 15 '23 at 13:40
  • Don't misuse interactive rebase in this way. Reaching back into history to eliminate a particular individual file is a very tricky business (as you've discovered). It would be better to use a dedicated tool such as `git filter-repo`. It is unclear what your actual question is; obviously you _are_ missing something here, because what happened is exactly what you should have expected given a sufficient understanding of Git. – matt Jul 15 '23 at 13:43

3 Answers


Git does not record renames. git mv is more or less a shortcut for mv followed by git rm and git add. Git commands that detect a rename do so by heuristics, which can often fail (also, you don't want it to guess wrong). Thus, as far as git history is concerned, 1.txt stopped existing at some point, and 2.txt was newly created.

git rebase will perform rename heuristics, but maybe the files were too different. See the find-renames merge strategy option (passed to rebase as -X find-renames=<n>), but be aware of the possibility of false positives.

So - if git rebase did detect a rename, removing 1.txt should also remove 2.txt. If it did not, you will have to remove both files separately (or tweak the rename threshold, but I probably would not recommend it).

Amadan

As matt mentioned in a comment, usually the easiest way to change history across a repo is to use git-filter-repo. In your case you can use the option --path-rename as described here.

Since you'll be working on Windows, you may find this answer useful for installing it.

TTT
  • I've never tried using `--path-rename` with invalid characters on the target OS. It's likely that it will still work if only the old filenames are invalid on Windows. If by chance it still doesn't work on Windows, then you could run `git-filter-repo` on Linux before moving it over to Windows. – TTT Jul 15 '23 at 16:36

I was able to fix my repos using a combination of git-filter-repo and git interactive rebase.

Git-filter-repo took care of deleting and/or renaming files throughout the entire git repo history. An interactive git rebase allowed me to delete a couple of leftover commits that were no longer needed. I created a new repo on GitHub and pushed the new local repo to it. I kept the old GitHub repo as a backup.

I discovered a few things along the way - hopefully this will help somebody.

1. Installation. Easiest way to install git-filter-repo on a Linux machine: sudo apt install git-filter-repo

2. General paradigm. 'git filter-repo' is acting as - you guessed it - a filter. Loosely speaking, think of it as passing a net through your entire repo history. Only the files caught by the net are operated on. For example, the '--path' option filters by path/filename and keeps only those files that match. They are "caught by the net" so they're the only ones that remain - all others are deleted.

3. Deleting files. Deleting files requires using both the '--path' and the '--invert-paths' options in the same command (i.e. there is no '--delete' option). The '--path' option catches selected files in the net. The '--invert-paths' option inverts what would normally be done with those files. Since they would normally be kept, they are instead deleted. For example, to delete the file 'january.txt' you would use git filter-repo --path january.txt --invert-paths.

4. Must specify full path. 'git filter-repo' needs the full path, starting just below the repo name. Suppose I want to delete the file 'repo_name/dir1/dir2/january.txt'. If I navigate to dir2 and type git filter-repo --path january.txt --invert-paths, the file will NOT be deleted. That's because it's only looking under repo_name/ (i.e. it's looking for the file 'repo_name/january.txt' instead of 'repo_name/dir1/dir2/january.txt'). The correct command would be git filter-repo --path dir1/dir2/january.txt --invert-paths.
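The net analogy from points 2 to 4 can be modelled in a few lines of Python. This is a simplified sketch of the selection logic only, not git-filter-repo's actual code; `filter_paths` and the sample paths are made up for illustration. It assumes `--path` catches either an exact repo-relative file or everything under a directory of that name.

```python
def filter_paths(all_paths, selected, invert=False):
    """Model of --path (and --invert-paths): return the paths that survive."""

    def caught(path):
        # A path is "in the net" if it equals a selected path or lies under it.
        return any(
            path == s or path.startswith(s.rstrip("/") + "/") for s in selected
        )

    # Without invert, only caught paths are kept; with invert, only the rest.
    return [p for p in all_paths if caught(p) != invert]


# Hypothetical repo contents, relative to the repo root.
history = ["january.txt", "dir1/dir2/january.txt", "dir1/board.sch"]

# Selecting "january.txt" with invert removes only the root-level file.
print(filter_paths(history, ["january.txt"], invert=True))
# Selecting the full path with invert removes the nested file instead.
print(filter_paths(history, ["dir1/dir2/january.txt"], invert=True))
```

In this model the current working directory plays no role at all, which is why running the command from inside dir2 makes no difference.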

5. Can't rename filenames with colons ('file:name.txt'). The '--path-rename' option of 'git filter-repo' cannot rename a file with a colon (:) in its name. The format of the command is git filter-repo --path-rename <oldname:newname>. Note that there's a colon in the command itself. When one of your file names also contains a colon, git-filter-repo gives an error saying, "...only one colon expected in argument." This appears to be a limitation of the '--path-rename' syntax rather than something that depends on whether you're on Linux or Windows (I was on Linux when I tried it).
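A possible workaround, untested on this repo: git-filter-repo also has a `--filename-callback` option that takes the body of a Python function. As documented, the body receives each path as a bytes object named `filename` and returns the new path. The sketch below wraps such a body in an ordinary function so it can be tried on its own; the name `rename_colons` and the choice of an underscore as the replacement are hypothetical.

```python
def rename_colons(filename: bytes) -> bytes:
    """Replace every colon in a repo path so the name is valid on Windows."""
    # git-filter-repo hands paths to callbacks as bytes, so use bytes here too.
    return filename.replace(b":", b"_")


print(rename_colons(b"file:1.txt"))
```

If the option behaves as described, the equivalent command would be roughly git filter-repo --filename-callback 'return filename.replace(b":", b"_")'. It is worth trying on a fresh clone first and checking that the new name does not collide with an existing file.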

6. Wildcards search all directories. Globs/wildcards on 'git filter-repo' operate on the entire history and directory structure of the repo. This is true regardless of 1) the branch you've checked out, 2) which commit you're on, or 3) the directory you're currently in. The first two you probably want but the third has some gotchas. For example, suppose you want to delete all the text files in the directory 'repo_name/dir1/dir2'. If you navigate down to dir2/ and type git filter-repo --path-glob '*.txt' --invert-paths, you will delete all .txt files throughout your entire git repo's directory structure - not just those in dir2. Instead, you must specify the full path to the files, so you'd need to do git filter-repo --path-glob 'dir1/dir2/*.txt' --invert-paths. This will delete only the text files in dir2.
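The glob behaviour in point 6 can be reproduced with Python's `fnmatch` module, whose `*` also matches `/`. git-filter-repo appears to use the same kind of matching for `--path-glob`, but treat that as an assumption; the paths and the helper `matches` are invented for illustration.

```python
from fnmatch import fnmatch

# Hypothetical repo contents, relative to the repo root.
paths = [
    "notes.txt",
    "dir1/readme.txt",
    "dir1/dir2/a.txt",
    "dir1/dir2/old/b.txt",
    "dir1/dir2/board.sch",
]


def matches(pattern):
    """Return the paths that the glob pattern would select."""
    return [p for p in paths if fnmatch(p, pattern)]


# A bare '*.txt' selects text files at every depth.
print(matches("*.txt"))
# Prefixing the directory narrows the selection to that subtree.
print(matches("dir1/dir2/*.txt"))
```

Under this model, 'dir1/dir2/*.txt' also selects .txt files in subdirectories of dir2, so it may be worth checking the list of matching paths before running the real command.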

svk