I have a git repo stored in Bitbucket with the following top-level structure:

  • README.md
  • dir_a
  • dir_b
  • dir_c

Dirs a, b and c have grown to the point where I really want to refactor them into their own repos. After some googling I found the following article from Atlassian: https://confluence.atlassian.com/bitbucket/split-a-repository-in-two-313464964.html.

But when I get to step 7 and run the command:

git filter-branch --index-filter 'git rm --cached -r dir_b dir_c' -- --all

I see the error:

Rewrite 7e0f6b3a2e52696d3ca934021e60095099c0dfd4 (1/60) (0 seconds passed, remaining 0 predicted)    fatal: pathspec 'dir_b' did not match any files
index filter failed: git rm --cached -r dir_b dir_c

What is the cause of this error? How can I split up the repo while maintaining the commit history?

wolfson109
  • Possible duplicate of [Proper way to remove unwanted files with git filter-branch without git rm failing](https://stackoverflow.com/questions/12179611/proper-way-to-remove-unwanted-files-with-git-filter-branch-without-git-rm-failin) – phd Oct 30 '19 at 17:20
  • https://stackoverflow.com/search?q=%5Bgit%5D+filter-branch+index+filter+failed%3A+git+rm – phd Oct 30 '19 at 17:20

1 Answer

TL;DR: add --ignore-unmatch.

Direct cause: git rm --cached will fail if the file does not exist in the index at the time that the git rm runs.

Background: git filter-branch works by copying commits. Precisely how it does this depends on the exact set of filters chosen. You used --index-filter, which is one of the simplest (and thus fastest): for each commit to be copied, it reads that commit into the index, lets you run some command to alter the index, and then makes a new commit from whatever is left in the index after your command.

In this case, you're copying 60 commits (the 1/60 in the progress output). Git does this copying in the order required, which in general means starting at the very first commit ever made in the repository. It's likely that this commit does not have any dir_b/* or dir_c/* files. So git rm --cached looks for files whose names start with dir_b (and, in a moment, dir_c), finds none, and aborts with an error.

This stops the filter-branch operation, which leaves the repository unmodified.
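If you want to see the failure in isolation, here is a minimal sketch that reproduces it in a throwaway repository. Everything in it is an assumption made for illustration: the temporary directory, the file contents, the commit messages, and the two-commit history (a root commit with only README.md, then a commit that adds dir_b). FILTER_BRANCH_SQUELCH_WARNING only silences the warning that newer Git versions print before running filter-branch.

```shell
#!/bin/sh
# Hypothetical throwaway repository, used only to reproduce the error.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Root commit: README.md only, no dir_b yet.
echo "readme" > README.md
git add README.md
git commit -q -m "initial commit"

# Second commit: dir_b appears.
mkdir dir_b
echo "b" > dir_b/b.txt
git add dir_b
git commit -q -m "add dir_b"

before=$(git rev-parse HEAD)

# Show what the root commit contains at the top level.
root=$(git rev-list --max-parents=0 HEAD)
git ls-tree --name-only "$root"

# Same shape as the failing command: no --ignore-unmatch.
export FILTER_BRANCH_SQUELCH_WARNING=1
if git filter-branch --index-filter 'git rm --cached -r dir_b' -- --all; then
    status=rewritten
else
    status=failed
fi

after=$(git rev-parse HEAD)
echo "filter-branch: $status"
echo "HEAD before: $before"
echo "HEAD after:  $after"
```

The filter should fail on the root commit, where there is nothing named dir_b to remove, and HEAD should name the same commit before and after.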

The cure in this particular case is to use git rm --cached --ignore-unmatch, which tells git rm that not finding a file to remove is not an error. So when filter-branch reads out the very first commit (which probably has just a README file or some such), it's OK to leave everything alone and re-commit that first commit unchanged. Eventually you'll reach commits that do have the files, and git rm --cached --ignore-unmatch will remove them. Each such replacement commit differs from its original, and so does every commit after it, because each copy points to a rewritten parent. The final tip commit of the branch is therefore different as well, and filter-branch makes the branch name refer to the new commits instead of the old ones.
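Applied to the command from the question, the fix looks like the filter-branch line in the sketch below. The rest of the script is scaffolding I made up so the example runs on its own: a temporary repository whose first commit has only README.md and whose second commit adds dir_a, dir_b and dir_c. FILTER_BRANCH_SQUELCH_WARNING just suppresses the warning newer Git versions print. In a real repository, work on a fresh clone, since this rewrites history.

```shell
#!/bin/sh
set -e

# Hypothetical throwaway repository standing in for the real one.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# First commit: README.md only, like the commit that broke the original run.
echo "readme" > README.md
git add README.md
git commit -q -m "initial commit"

# Later commit: the three directories appear.
mkdir dir_a dir_b dir_c
echo "a" > dir_a/a.txt
echo "b" > dir_b/b.txt
echo "c" > dir_c/c.txt
git add dir_a dir_b dir_c
git commit -q -m "add dir_a, dir_b, dir_c"

# The command from the question, with --ignore-unmatch added so that
# commits without dir_b or dir_c are copied instead of aborting the run.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch \
    --index-filter 'git rm --cached -r --ignore-unmatch dir_b dir_c' \
    -- --all

# The rewritten tip should list only README.md and dir_a/a.txt.
git ls-tree -r --name-only HEAD
```

Both commits survive the rewrite: the first is copied as-is, and the second is replaced by a commit that adds only dir_a.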

torek