
Overview

I have a Git repo which serves as an archive for a number of configs sourced from elsewhere:

  • 200 dirs
  • 100 files per dir
  • ~10 KB of plain text per file
  • ~1000 commits per day, usually < 10 lines edited
  • 1 branch
  • 1 user handles all commits
  • all other users view the repo on a read-only basis

Before anyone suggests I try breaking up this repo into smaller ones: that's not an option, because of our customers.

The repo contains ~2 years' worth of data, but our customers only need the last 90 days.

Process

I have successfully grafted the root onto a commit from 90 days ago using the following method:

  • git checkout -b newroot xyz_90_days_old_rev
  • git reset abc_original_root_rev
  • git add .
  • git commit --amend -m 'purge history'
  • git checkout master
  • git rebase --onto newroot xyz_90_days_old_rev
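
A quick sanity check after the rebase (a sketch, assuming the commands above completed on master):

    git rev-list --max-parents=0 master   # should print only the new, amended root commit
    git rev-list --count master           # count should now cover roughly 90 days of commits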

The problem is the git-filter-branch cleanup afterwards: it takes > 24 hours, which is unacceptable downtime for users.

I would like to try bfg-repo-cleaner instead, but it's not clear to me:

  • does it support this use case?
  • does it work on a non-bare repo?

PS: I am now aware that git checkout --orphan would have been slightly more elegant, but it doesn't really change the problem: BFG requires the repo to be bare, while checkout requires it not to be bare.
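
For reference, a minimal sketch of that orphan-branch variant, using the same placeholder revision names as above:

    git checkout --orphan newroot xyz_90_days_old_rev   # unborn branch; index/tree taken from the 90-day-old commit
    git commit -m 'purge history'                       # no git add needed, the index is already populated
    git checkout master
    git rebase --onto newroot xyz_90_days_old_rev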

333kenshin

1 Answer


Don't bother rewriting the repo. If someone only needs the last 90 days, run a script to estimate the required depth and have them create a shallow clone with --depth <depth>.
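
A rough sketch of that approach, assuming master is the published branch and the clone URL is a placeholder:

    # in the archive repo: count how many commits cover the last 90 days
    depth=$(git rev-list --count --since="90 days ago" master)

    # customers then clone only that much history
    git clone --depth "$depth" https://example.com/archive.git

On Git versions that support it, git clone --shallow-since="90 days ago" <url> avoids the counting step entirely.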

BFG should be run on a mirror/bare clone.
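
For the second question: BFG is designed to run against a bare mirror clone, roughly like this (the URL, jar path and --delete-files argument are placeholders; note BFG strips unwanted files/blobs from history rather than truncating it to a date):

    git clone --mirror https://example.com/archive.git
    java -jar bfg.jar --delete-files '<unwanted-file>' archive.git
    cd archive.git
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
    git push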

javabrett