I added a directory of files to my Fossil repository, but:

  1. the files it contains take up far more space than I expected
  2. I realized afterward that adding it was completely superfluous.

So now I have a repository an order of magnitude bigger than it needs to be, holding files that were never useful. The whole directory was added in a single commit that did nothing else, and it has never been modified since, but I have made other commits afterward. (Now that I'm more familiar with Fossil, I know I could have used undo before doing anything else, but at the time I wasn't aware of the possibility.)

The only way I have found to do the job is to shun the data, but I also read online that this operation can wreak havoc on the database. Since this is a work-related repository, I'm concerned about causing damage.

Is there a safe way to get rid of those files that will not leave the database corrupted or full of warnings?

EnricoGiampieri

1 Answer

If the bad checkin exists only in your repository (or your repository plus a server) and has not been pulled by other users, the simplest solution is to use fossil purge.

Use fossil purge checkins <tag> to move those checkins to the "graveyard"; the <tag> part can also be the hash of a checkin, not just a symbolic tag. Be aware that if you specify a branch, the entire branch will be purged; even if you don't specify a branch, all descendants of the checkin will be purged (as they depend on it). Once you've confirmed that everything is in order, use fossil purge obliterate to get rid of the graveyard if you need to free up the disk space. If you don't need the disk space, you can let the graveyard sit around for a while until you're certain that everything is okay. Consult fossil help purge for further options.
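As a rough sketch of that workflow, the helper below walks through a preview, the purge itself, and a listing of the graveyard. The names run, purge_checkin and FOSSIL_RUN are made up for this illustration, and the hash passed in is a placeholder. By default it only prints the commands, so nothing is touched until you set FOSSIL_RUN=1. The --explain option is, as far as I know, listed under the common options in fossil help purge; verify that on your version.

```shell
#!/bin/sh
# Sketch only: "run", "purge_checkin" and FOSSIL_RUN are illustrative names,
# not part of fossil. Commands are printed; they execute only if FOSSIL_RUN=1.
run() {
  echo "+ $*"
  if [ "${FOSSIL_RUN:-0}" = 1 ]; then "$@"; fi
}

purge_checkin() {
  bad=$1                                       # hash or tag of the bad checkin (placeholder)
  run fossil purge checkins "$bad" --explain   # preview: reports what would be purged
  run fossil purge checkins "$bad"             # move the checkin and descendants to the graveyard
  run fossil purge list                        # show graveyard entries and their IDs
}
```

Calling purge_checkin abc123 prints the three commands in order. The IDs shown by fossil purge list are what fossil purge undo and fossil purge obliterate expect on the versions I have seen, but check fossil help purge before relying on that.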

You may want to keep a backup of the repository (it's just a single file, you can just copy it) for a bit in case something didn't go right.
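A minimal sketch of such a backup, assuming no fossil process is using the repository while it is copied. The helper name backup_repo and the .bak-<timestamp> suffix are arbitrary choices for this example.

```shell
#!/bin/sh
# Sketch only: "backup_repo" is a hypothetical helper, not a fossil command.
# Copies the repository file next to itself and prints the backup's path.
backup_repo() {
  repo=$1
  backup="$repo.bak-$(date +%Y%m%d-%H%M%S)"
  # Refuse to continue if the repository file does not exist.
  [ -f "$repo" ] || { echo "no such repository file: $repo" >&2; return 1; }
  # -p preserves the timestamps and mode of the original file.
  cp -p "$repo" "$backup" && echo "$backup"
}
```

For example, backup_repo ~/repos/project.fossil (a placeholder path) leaves a timestamped copy in the same directory. Restoring is just copying that file back over the original.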

The shunning mechanism exists only to purge artifacts globally and is meant to be used on a central server as a last resort: it will prevent those artifacts from being propagated anymore to other users via that server. If your changes are local only or if you have access to all the servers and can use fossil purge instead, shunning is unnecessary.


If you actually need to purge something in the middle of a branch, additional steps are required.

  1. Make a backup of the repository file, as you're going to do non-trivial surgery on it.
  2. Use fossil update to move to the checkin just prior to the defective one.
  3. Use fossil merge --cherrypick to copy the first "good" checkin. Do fossil commit --allow-fork to commit the copy of that checkin; the editor should be prepopulated with the original commit message. You will be prompted to confirm that you don't want to change the commit message. Press "y".
  4. Repeat step 3 (fossil merge --cherrypick + fossil commit) for all remaining "good" checkins. You won't need --allow-fork for these.

You should now have a fork with all the checkins that you want to preserve and a separate fork with the bad checkin and the original version of the good ones. Verify the graph in fossil ui to see that everything is in order. Once that is done, use fossil purge to get rid of the bad checkin and its descendants as described above.

The process in steps 3+4 can be automated with a shell script:

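#!/bin/sh
# Replay each checkin given on the command line onto the current checkout.
set -e  # stop at the first failing merge or commit
for commit in "$@"; do
  fossil merge --cherrypick "$commit"
  # VISUAL=true makes the "editor" a no-op, so the prepopulated message is
  # kept; "echo yes" answers the prompt about the unchanged message.
  echo yes | VISUAL=true fossil commit --allow-fork
done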

Put this in a file, say fossil-replay.sh, make it executable, then run ./fossil-replay.sh commit1 commit2 ... commitn to replay commit1 through commitn from the current position in the repository. Replace commit1 etc. with the actual commit hashes.
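To see how the pieces fit together, here is a small sketch that only prints the whole sequence for a given set of hashes, so you can read it over before typing anything. The function name print_plan and its argument order (parent of the bad checkin, the bad checkin, then the good checkins in order) are assumptions made for this example. It runs no fossil command itself.

```shell
#!/bin/sh
# Sketch only: prints the sequence of commands, runs nothing.
# Usage (all arguments are placeholders): print_plan PARENT BAD GOOD1 [GOOD2 ...]
print_plan() {
  parent=$1; bad=$2; shift 2
  echo "fossil update $parent"                  # step 2: go to the checkin before the bad one
  echo "./fossil-replay.sh $*"                  # steps 3+4: replay the good checkins
  echo "fossil ui"                              # inspect the graph before purging
  echo "fossil purge checkins $bad --explain"   # preview what the purge would remove
  echo "fossil purge checkins $bad"             # purge the bad checkin and its descendants
}
```

For instance, print_plan p0 bad1 g1 g2 prints five lines, starting with fossil update p0 and ending with fossil purge checkins bad1. Take the backup first, and stop at the fossil ui step if the graph does not look as expected.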

Reimer Behrends
  • Correct me if I'm wrong, but doesn't the purge command remove the checkin and all of its descendants? In that case it would remove everything that has been done until that point, given that it is in the main branch... – EnricoGiampieri Mar 16 '17 at 13:44
  • Descendants are checkins that were made after the purged checkins (and are dependent on them), not before. – Reimer Behrends Mar 16 '17 at 13:48
  • yes, and the checkin I want to remove is in the trunk, and several commits have been done after that. They don't modify or touch the data from that checkin, but are in the same branch (the trunk). As much as I acknowledge the error in continuing working on the same branch, I wanted to know if it was possible to nuke only the file from that commit without touching the following ones. – EnricoGiampieri Mar 16 '17 at 14:02
  • Ah, I understand. Then let me update my answer to write out how to deal with that. – Reimer Behrends Mar 16 '17 at 14:05
  • Isn't there a simpler way to just get rid of the files, even leaving the commit in place? It seems like that procedure opens up the possibility of even more mistakes... – EnricoGiampieri Mar 18 '17 at 11:46
  • The simpler mechanism is shunning, but that comes with the risk associated with basically punching a hole in the version history (which is also why you want to shun individual files, and not the entire commit). The problem is that each commit potentially depends on any and all previous commits (delta encoding). Thus, if you were to just remove a commit from the history, all subsequent commits need to be re-packaged in order for the delta encoding to be redone properly. I've updated my answer with a shell script to automate the relevant part of the process in the hope that this helps. – Reimer Behrends Mar 18 '17 at 12:43
  • Thank you very much, you've been extremely helpful. Just for information, can I shun a file from the command line as well? I only found the web interface for that. – EnricoGiampieri Mar 18 '17 at 19:13