Is there a way in Mercurial to remove old changesets from a repository? I have a repository that is 60GB, which makes cloning pretty painful. I would like to trim off everything before a certain date and put the huge original away to collect dust.
How did it get that big to begin with? – Santa Apr 23 '10 at 01:43
If Jake has any binary files (which are sometimes necessary... not all binary files are generated by source), then every small change to that file results in a new copy made in the repo. Depending on the size of the file or the frequency of changes, something like 60GB might not take so long. – bobpaul Mar 30 '11 at 15:44
2 Answers
There is no simple / recommended way of doing this directly to an existing repository.
You can, however, "convert" your Mercurial repo to a new Mercurial repo and choose the revision from which history is kept via the convert.hg.startrev option:
hg convert --config convert.hg.startrev=1234 <source-repository> <new-repository-name>
The new repo will contain everything from the original repo minus the history previous to the starting revision.
Caveat: The new repo will have completely new changeset IDs, i.e. it is in no way related to the original repo. After the new repo is created, every developer has to clone it and delete their clones of the original repo.
I used this to clean up old repos used internally within our company, combined with the --filemap option to remove unwanted files too.

So how is that solution different from just removing the .hg/ subdirectory and doing hg init? – grzaks Aug 21 '12 at 22:33
Sorry, I don't understand your comment. What does the local .hg dir and an hg init have to do with removing changesets from a repository? – Gerd Klima Aug 23 '12 at 09:44
@Gaks, your approach destroys the entire repository and creates a new one; Gerd's approach selectively destroys the repository from a certain revision back. Both approaches are useful depending on your circumstances. Gerd's approach is useful to me right now as I'm about to make a private repository public and want to keep the last month's revisions, but not anything before that. – lsh Oct 12 '12 at 05:25
@Gaks your approach will also only keep the current branch, whereas Gerd's approach will keep all of them. – Mark Broadhurst Nov 25 '14 at 11:52
This needs the `convert` extension to be enabled: in your hgrc, under `[extensions]`, add `hgext.convert=` – nicodemus13 Dec 03 '14 at 11:02
You can do it, but in doing so you invalidate all the clones out there, so it's generally not wise to do unless you're working entirely alone.
Every changeset in Mercurial is uniquely identified by a hash, which is computed from (among other things) the source code changes, the metadata, and the hashes of its one or two parents. Those parents need to exist in the repo all the way back to the start of the project. (Lifting that restriction would amount to shallow clones, which aren't available yet.)
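To make the parent dependency concrete, here is a minimal Python sketch of a parent-chained hash. It is a simplification modeled on the idea described above (SHA-1 over the sorted parent ids plus the revision text), not Mercurial's actual code path; the function name `node_id` and the commit texts are made up for illustration, and real changeset text carries more metadata than a single message.

```python
import hashlib

NULL_ID = b"\0" * 20  # placeholder id used when a parent is absent

def node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Hash the two parent ids (sorted) followed by the revision text."""
    lo, hi = sorted((p1, p2))
    return hashlib.sha1(lo + hi + text).digest()

# A linear history of three changesets: a <- b <- c
a = node_id(b"first commit")
b = node_id(b"second commit", a)
c = node_id(b"third commit", b)

# The same last two changesets, but with the first one trimmed away,
# so "second commit" now has no parent.
b_trimmed = node_id(b"second commit")
c_trimmed = node_id(b"third commit", b_trimmed)

# Both ids change even though the commit texts are identical.
print(b != b_trimmed, c != c_trimmed)
```

Because each id feeds into the next, dropping any ancestor changes the id of every descendant, which is why existing clones stop matching the trimmed repo.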
If you're okay with changing the hashes of the newer changesets (which again breaks all the clones out there in the wild), you can do so with these commands:
hg export -o 'changeset-%r.patch' 400:tip # changesets 400 through the end for example
cd /elsewhere
hg init newrepo
cd newrepo
hg import /path/to/the/patches/*.patch
You'll probably have to do a little work to handle merge changesets, but that's the general idea.
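One detail of the export pattern above is worth spelling out: the shell expands `*.patch` in sorted name order, and `hg import` applies the patches in the order it receives them, so the file names have to sort in revision order. A small Python sketch of the difference, using made-up revision numbers 8 through 11 and assuming a padding width of two digits:

```python
# Hypothetical file names for revisions 8..11.
revs = range(8, 12)

# Without zero padding, as an unpadded revision number would produce.
unpadded = [f"changeset-{rev}.patch" for rev in revs]

# With zero padding (width 2 is an assumption for this example).
padded = [f"changeset-{rev:02d}.patch" for rev in revs]

# Plain string sorting puts 10 and 11 before 8 and 9 ...
print(sorted(unpadded))

# ... while the padded names sort in true revision order.
print(sorted(padded))
```

With unpadded names the import would try to apply revision 10 before revision 8, which is why the zero-padded `%r` is the right choice in the file name pattern.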
One could also do it using hg convert with type hg as both the source and the destination types, and using a splicemap, but that's probably more involved yet.
The larger question is: how do you type up 60GB of source code? Or were you adding generated files against all advice? :)

I am importing from another source control system a project that includes generated binaries. We're trying to get the dlls out of source control right now. Thanks for the help! – Jake Pearson Apr 21 '10 at 21:11
The answer contains a typo. The filename should contain lowercase %r (zero-padded numbers), otherwise the files won't get processed in the right order when you import them. – Gili Sep 07 '10 at 18:14
I found it helpful to import the patches into Mercurial queues first using `hg qimport`, then push them one by one with `qpush`, fixing any problems along the way, and finally commit everything with `hg qfinish -a` – mrucci May 26 '11 at 01:23
In my experience `hg convert` retains more information than `hg export` (no need to merge changesets). Gerd's answer is easier than using `splicemap` and is probably more reliable than `export`: http://stackoverflow.com/a/8819813/14731 – Gili Aug 27 '13 at 02:41