I was able to fix my repos using a combination of git-filter-repo and git interactive rebase.
Git-filter-repo took care of deleting and/or renaming files throughout the entire git repo history. An interactive git rebase allowed me to delete a couple of leftover commits that were no longer needed. I created a new repo on GitHub and pushed the new local repo to it. I kept the old GitHub repo as a backup.
I discovered a few things along the way - hopefully this will help somebody.
1. Installation. Easiest way to install git-filter-repo on a Debian- or Ubuntu-based Linux machine: sudo apt install git-filter-repo
2. General paradigm. 'git filter-repo' is acting as - you guessed it - a filter. Loosely speaking, think of it as passing a net through your entire repo history. Only the files caught by the net are operated on. For example, the '--path' option filters by path/filename and keeps only those files that match. They are "caught by the net" so they're the only ones that remain - all others are deleted.
3. Deleting files. Deleting files requires using both the '--path' and the '--invert-paths' options in the same command (i.e. there is no '--delete' option). The '--path' option catches selected files in the net. The '--invert-paths' option inverts what would normally be done with those files. Since they would normally be kept, they are instead deleted. For example, to delete the file 'january.txt' you would use `git filter-repo --path january.txt --invert-paths`.
4. Must specify full path. 'git filter-repo' needs the full path, starting just below the repo name. Suppose I want to delete the file 'repo_name/dir1/dir2/january.txt'. If I navigate to dir2 and type `git filter-repo --path january.txt --invert-paths`, the file will NOT be deleted. That's because it's only looking under repo_name/ (i.e. it's looking for the file 'repo_name/january.txt' instead of 'repo_name/dir1/dir2/january.txt'). The correct command would be `git filter-repo --path dir1/dir2/january.txt --invert-paths`.
5. Can't rename filenames with colons ('file:name.txt'). The '--path-rename' option of 'git filter-repo' cannot rename a file with a colon (:) in its name. The format of the command is `git filter-repo --path-rename <oldname:newname>`. Note that there's a colon in the command itself. When one of your file names also contains a colon, git-filter-repo gives an error saying, "...only one colon expected in argument." This appears to be an inherent limitation of how git-filter-repo parses this option, rather than relating to whether you're on Linux or Windows (I was on Linux when I tried it).
6. Wildcards search all directories. Globs/wildcards on 'git filter-repo' operate on the entire history and directory structure of the repo. This is true regardless of 1) the branch you've checked out, 2) which commit you're on, or 3) the directory you're currently in. The first two you probably want but the third has some gotchas. For example, suppose you want to delete all the text files in the directory 'repo_name/dir1/dir2'. If you navigate down to dir2/ and type `git filter-repo --path-glob '*.txt' --invert-paths`, you will delete all .txt files throughout your entire git repo's directory structure - not just those in dir2. Instead, you must specify the full path to the files, so you'd need to do `git filter-repo --path-glob 'dir1/dir2/*.txt' --invert-paths`. This will delete only the text files in dir2.
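If you want to try points 3, 4 and 6 without risking a real repo, here is a minimal sketch on a throwaway repo. Everything in it is an assumption for illustration: the temp directory, the file names and the variable names are made up. It passes '--force' because a freshly created scratch repo is not the fresh clone that git-filter-repo normally expects. The last step is not something I tried in the answer above: it uses the '--filename-callback' option, which takes a snippet of Python and so does not rely on the colon separator, as one possible way around point 5. Treat it as a suggestion to test, not a confirmed fix.

```shell
#!/usr/bin/env bash
# Throwaway demo: every path and file name below is hypothetical.
set -euo pipefail

# Skip quietly if git-filter-repo is not available on this machine.
if ! git filter-repo -h >/dev/null 2>&1; then
  echo "git-filter-repo not found; skipping demo"
  exit 0
fi

# Build a scratch repo with a few sample files.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir -p dir1/dir2
echo root   > january.txt
echo nested > dir1/dir2/january.txt
echo notes  > dir1/dir2/notes.txt
echo readme > readme.txt
echo colon  > 'file:name.txt'
git add .
git commit -q -m "Add sample files"

# Helper: list every path in the tip commit on a single line.
list_files() { git ls-tree -r --name-only HEAD | paste -sd ' ' -; }

# Point 4: a bare file name is matched against the repo root only,
# so the copy under dir1/dir2/ should survive this run.
git filter-repo --force --path january.txt --invert-paths
after_root_delete=$(list_files)

# Point 4, corrected: give the path relative to the repo root.
git filter-repo --force --path dir1/dir2/january.txt --invert-paths
after_nested_delete=$(list_files)

# Point 6: scope the glob with the full path so that the root-level
# readme.txt is left alone.
git filter-repo --force --path-glob 'dir1/dir2/*.txt' --invert-paths
after_scoped_glob=$(list_files)

# Point 5, possible workaround (an assumption, not from the answer above):
# rename the colon file with a Python callback instead of --path-rename.
git filter-repo --force --filename-callback \
  'return b"file_name.txt" if filename == b"file:name.txt" else filename'
after_callback_rename=$(list_files)

echo "after root-level delete: $after_root_delete"
echo "after nested delete:     $after_nested_delete"
echo "after scoped glob:       $after_scoped_glob"
echo "after callback rename:   $after_callback_rename"
```

Comparing the four printed lists shows which files each command actually caught. As with the steps above, run anything like this on a copy and keep a backup of the original repo.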