I created a private repository on GitLab over a year ago, and it has grown a fair bit since then. One mistake, in hindsight, was putting some large binary files in the repository. I did not know about git-lfs at the time, and the repo has now grown quite substantially.

So what I thought might be a possible approach would be:

  • Remove the files along with their history from the entire repository (on all branches as well).
  • Enable git-lfs (again somehow on all branches).
  • Add these files (again on all the branches).

The situation is that I have quite a few active branches on the repository. Is there a way to do this somehow across all branches with a minimal set of commands?

Another way, of course, is to archive this repo and then start from scratch in a brand new repository and enable git-lfs and add everything manually. However, given the number of branches, this again seems tedious.

Luca
  • https://rtyley.github.io/bfg-repo-cleaner/ will assist with the deletion aspect. – Oliver Charlesworth Dec 27 '18 at 14:18
  • Note that files don't have history, in Git. Instead, commits have files, and commits *are* history. You will be writing new commits to replace all (or at least most of) the existing commits; the new commits will omit the file and will, instead, use the git-lfs mechanisms to store references to the LFS-stored files. This really does mean rebuilding the entire repository. If a specialized tool exists, that will (presumably) be faster than doing it by starting from scratch, but in general that's how it will work. – torek Dec 27 '18 at 14:28
  • Your two ways are equivalent: commit histories are the unit of currency in Git, and repo boundaries are ephemeral. You're going to make history without those files; it doesn't matter what repo it's in. You can do the whole thing by installing git-lfs and running a filter-branch to construct the corrected history. – jthill Dec 27 '18 at 14:33

1 Answer

The git lfs migrate import command will do this automatically. It will rewrite every branch to remove the large files:

git lfs migrate import --everything

This will locate all large files in history and rewrite every branch to store them as LFS objects instead of keeping the large files directly in Git.

If you want to specify which files to import:

git lfs migrate import --everything --include='*.dat'

This will migrate all files ending in .dat, regardless of size.
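
Before rewriting anything, it can help to look at what is actually taking up space and to preview the exact command you are about to run. The following is a minimal sketch, not a definitive procedure: the run, lfs_report and lfs_import helpers and the DRY_RUN variable are hypothetical names made up for this example, and '*.dat' is only a placeholder pattern. The only git-lfs commands used are git lfs migrate info and git lfs migrate import.

```shell
#!/bin/sh
# Sketch only: inspect the history first, then migrate matching files to LFS.
# run, lfs_report, lfs_import and DRY_RUN are illustrative names, not part of git-lfs.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

lfs_report() {
    # Report which kinds of files take up space across all refs.
    run git lfs migrate info --everything
}

lfs_import() {
    # $1: include pattern, e.g. '*.dat' (a placeholder; adjust to your files).
    run git lfs migrate import --everything --include="$1"
}

# Usage, from inside a clone of the repository:
#   DRY_RUN=1 lfs_import '*.dat'    # only print what would be run
#   lfs_report && lfs_import '*.dat'
```

Running with DRY_RUN=1 first is cheap insurance here, since the import rewrites every commit that touches a matching file.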

I’d encourage you to plan this migration carefully if there are several people working in your repository, since ultimately you’ll need to force push the branch(es) you rewrite.
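
As a rough sketch of what that plan might look like on the command line: take a mirror backup first, then force push the rewritten branches and tags. The helper names (run, backup_mirror, publish_rewrite), the DRY_RUN variable, and the remote name origin are assumptions for illustration; the underlying commands are plain git clone --mirror and git push --force. On GitLab, protected branches may reject the force push until that is allowed in the project settings.

```shell
#!/bin/sh
# Sketch only: back up the old history, then publish the rewritten one.
# run, backup_mirror, publish_rewrite and DRY_RUN are illustrative names.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

backup_mirror() {
    # $1: URL of the existing repository, $2: directory for the backup copy.
    run git clone --mirror "$1" "$2"
}

publish_rewrite() {
    # $1: remote name (defaults to origin, which is an assumption).
    remote="${1:-origin}"
    # Every rewritten branch and tag has new commit IDs, so a normal push fails.
    run git push --force --all "$remote"
    run git push --force --tags "$remote"
}

# Usage:
#   DRY_RUN=1 publish_rewrite origin   # only print what would be run
```

After the push, anyone else working in the repository should make a fresh clone (or hard-reset their branches to the new history) rather than merging, otherwise the old commits with the large files will be pushed back.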

Edward Thomson