
I have just created a Mercurial repo from a heterogeneous ecosystem of other repos. Before I publish it to my co-workers, I want to clean it up as much as possible. To this end, I'd like to remove a few big old files from the history entirely (as if they had never existed), so that the repo will be smaller.

Is this possible with Mercurial?

static_rtti

1 Answer


Check out the convert extension, particularly the --filemap option.

Enable it by adding the following to mercurial.ini (or .hgrc on Unix-like systems):

[extensions]
convert =

Create a map of files to exclude:

exclude path/to/file1
exclude path/to/file2
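
When many files need to go, writing the map by hand gets tedious. The sketch below shows one way to generate the `exclude` lines from a list of paths. The function name `make_filemap` is hypothetical, and the sketch assumes each path is repository-relative and contains no whitespace or quote characters.

```shell
#!/bin/sh
# Hypothetical helper: read one repository-relative path per line on stdin
# and print an "exclude" line for each one, skipping blank lines.
# Assumption: paths contain no whitespace or quote characters.
make_filemap() {
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            printf 'exclude %s\n' "$path"
        fi
    done
    return 0
}

# Example: print a map for the two paths used above.
printf '%s\n' path/to/file1 path/to/file2 | make_filemap
```

Redirecting the output to a file gives a map that can be passed to --filemap. If a file was renamed at some point, both its old and its new path need to be in the input list.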

Then convert the repo:

hg convert srcrepo destrepo --filemap <map>

Note there is a bug in Mercurial 2.1.1 causing an error with the above command:

initializing destination destrepo repository
abort: invalid mode ('r') or filename

Adding the --splicemap <nonexistent file> option works around the problem.

Mark Tolonen
  • Thanks for your answer! The documentation for --filemap is pretty scarce, would you have a link explaining it with a little more detail? – static_rtti Apr 11 '12 at 13:04
  • I have just tested your method. The files are removed from the working directory, but they still seem to be in .hg/store/data. Do you know how to remove them from there as well? – static_rtti Apr 11 '12 at 13:23
  • @static_rtti, did you look in the destination repo? The files are completely gone in my test. – Mark Tolonen Apr 11 '12 at 13:28
  • OK, I've found out what was going on: some files had been moved around, and I had to put the original file path in the file map as well as the new path. – static_rtti Apr 11 '12 at 13:57
  • As I've thought through what Mercurial supports, it looks to me like the only way to remove *every* binary file from the repository's history is to go through the effort of finding every binary and its renames and adding each one manually to the map. Is that accurate? – Chris Krycho Jun 11 '13 at 15:58
  • @ChrisKrycho, it could be automated. `hg locate "set:binary()"` will find any binary in the working folder. `hg log -r "file('set:binary() and copied()')"` can find any revision in which a binary file is present and was copied. A little scripting and it wouldn't be painful. – Mark Tolonen Jun 12 '13 at 02:49
  • Ooh, that could be extremely handy. Looks like I'm going to be doing some reading on `hg locate`. Thanks much for the info. – Chris Krycho Jun 12 '13 at 12:16
  • Great tip. I was able to remove personal data from a long ago committed file. BTW: my filemap file had an extra linefeed at the end, which may have caused the first try to not remove the file. I haven't tested that theory though. Thanks! – jetimms Aug 08 '13 at 03:36
  • Minor note: you can load the convert extension without editing hgrc. If this is a one-off thing (it was for me), I'd rather do it that way. `hg --config=extensions.convert= convert --filemap map.txt old new`. – jpkotta Feb 06 '15 at 17:26