9

So here's what's happened:

  1. Accidentally committed lots of files that weren't meant to be committed.
  2. Did a `git reset --soft HEAD~2` to get back to a commit before the accident.
  3. Modified .gitignore to ignore the files.
  4. Committed again and pushed to origin.

I assumed the `git reset` would reverse everything from the accidental commit, but after checking Bitbucket's list of Git LFS files, it seems that all the LFS-tracked files from the accidental commit were pushed to LFS in origin. These files do not appear when I browse the source in Bitbucket.

So I tried running `git lfs prune`, which appeared to delete roughly as many files as were accidentally committed, and then `git lfs push origin master`. I checked Bitbucket's list of Git LFS files again, but those files are still there and nothing has changed in origin.

What have I done wrong?

Vadim Kotov
Arvin

3 Answers

14

There doesn't appear to be a standard way of doing this:

The Git LFS command-line client doesn't support pruning files from the server, so how you delete them depends on your hosting provider.

Bitbucket allows you to delete LFS files using its web UI (please read the entire linked page before proceeding):

Delete individual LFS files from your repository

It's important to understand that:

  • The delete operation described here is destructive – there's no way to recover the LFS files referenced by the deleted LFS pointer files (it's not like the git remove command!) – so you'll want to back up the LFS files first.
  • Deleting an LFS file only deletes it from the remote storage. All reference pointers stored in your Git repo will remain.
  • No branch, tag or revision will be able to reference the LFS files in future. If you attempt to check out a branch, tag or revision that includes a pointer file referencing a deleted LFS file, you'll get a download error and the check out will fail.

A repository admin can delete Git LFS files from a repo as follows:

  1. Go to the Settings page for the repo and click Git LFS to view the list of all LFS files in that repo.
  2. Delete the LFS files using the actions menu.
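Since a deleted object breaks any checkout that still points at it, it helps to work out in one pass which server-side objects are unreferenced before deleting anything. The following is a minimal sketch, not an official tool. It assumes you have saved the output of `git lfs ls-files --all --long` (available in recent Git LFS versions) and that each line starts with the object ID. `server_list` is a hypothetical list of object IDs collected by hand from the host's LFS file listing, and the sample IDs and paths are made up.

```python
def referenced_oids(ls_files_output):
    """Collect OIDs from text shaped like `git lfs ls-files --all --long` output.

    Assumes each non-empty line starts with the object ID, followed by a
    marker and the path, e.g. "<oid> * assets/big.bin".
    """
    oids = set()
    for line in ls_files_output.splitlines():
        parts = line.split()
        if parts:
            oids.add(parts[0])
    return oids


def orphaned_oids(server_oids, ls_files_output):
    """Return server-side OIDs that no pointer in the listed history refers to."""
    return sorted(set(server_oids) - referenced_oids(ls_files_output))


# Hypothetical sample data; real OIDs are 64-character SHA-256 hex strings.
sample_output = """\
aaa111 * textures/wall.png
bbb222 - models/tree.fbx
"""
server_list = ["aaa111", "bbb222", "ccc333", "ddd444"]

print(orphaned_oids(server_list, sample_output))
```

Anything the function returns is a candidate for deletion through the web UI. It is still worth spot-checking a few entries first, because the delete cannot be undone.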

Surprisingly, the only way to remove LFS files from GitHub appears to be to delete and recreate the repository, losing issues, stars, forks, and possibly other data.

Community
ChrisGPT was on strike
  • Thanks! I found that page not long after I posted the question and started using Bitbucket's LFS management tool to delete the accidental files. However, since I have to `git log` every file I want to delete to ensure no commit references it, it became clear that this was going to be a very long, manual process. I accidentally committed easily hundreds of files. Sitting here deleting these files one by one is really not a great solution. I really hope there will be better tools for managing LFS in future – Arvin Aug 28 '17 at 20:38
3

In the initial steps you followed, I think you've just stumbled on one of the cases where git / git-lfs integration isn't always perfectly seamless.

The reset command would have moved your branch ref back. It would not have actually removed the unwanted commit (or related objects); but that normally wouldn't matter, because those objects are unreachable so would not be sent with a push. So far so good... with vanilla git.
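To make the "moved the ref, kept the objects" point concrete, here is a toy model in Python. It is only an illustration: the commit names are invented, the dictionary stands in for git's object store, and it ignores the reflog, which in real git keeps recently abandoned commits around for a while.

```python
# Toy model of a Git object store: commit id -> parent commit ids.
# The commit names below are made up for illustration.
objects = {
    "base": [],
    "good": ["base"],
    "accident1": ["good"],
    "accident2": ["accident1"],
}
refs = {"master": "accident2"}


def reachable(objects, refs):
    """Return every commit reachable from any ref by following parents."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(objects[commit])
    return seen


# `git reset --soft HEAD~2` only moves the branch ref two commits back.
refs["master"] = "good"

# The accidental commits are still stored, just no longer reachable.
unreachable = set(objects) - reachable(objects, refs)
print(sorted(unreachable))
```

The two accidental commits remain in `objects` after the reset; they simply drop out of the reachable set, which is why a plain `git push` would not send them.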

BUT: The LFS objects (the real content of the large files) also weren't deleted prior to your push. AFAIK (and your experience seems to confirm this) LFS does not attempt to determine if LFS objects are reachable when pushing to the remote - which would, after all, seem to be an expensive check. Given that your LFS store is meant to house a bunch of large binary files, and that LFS is designed to mitigate the costs of having a large volume of unneeded data in the LFS store, the cost-benefit would usually favor just sending anything that's not on the server - which is what apparently happened here.

And unless you're facing a limit on physical storage on the server, that may be ok really. No fetch or pull - short of explicitly telling LFS to send you everything, which is not intended for normal usage - is going to cause those files to be downloaded anyway.

But maybe you're running into a storage limit with your repo host. Or maybe you just want them gone; I can't say I'd blame you. That deleting the files locally and pushing does not result in the files being removed from the server is, again, by design. (The same is true of core git objects; you can force-push a ref to make a remote object unreachable, but physically "cleaning up" the remote is independent of any local clean-up.)

Info on removing LFS files from a Bitbucket-hosted repo can be found here: https://www.atlassian.com/git/tutorials/git-lfs#deleting-remote-files

Mark Adelsberger
  • I am actually running out of storage in my Bitbucket repo, which is why I was concerned with space in the first place. My understanding of vanilla git principles is that git has a garbage collector that runs periodically and removes any objects that no longer have any references to them. The LFS files most certainly do not have any commits referring to them, so by git principles, those files should be automatically removed, right? I suppose since LFS was designed independently from git itself, the same principles don't always apply – Arvin Aug 28 '17 at 20:34
  • It is correct that `git gc` should eventually clean up unreachable git objects, but also beware that how this works on hosted remotes varies with the hosting service. In this sense, `git lfs prune` is similar to `gc` - it tries to clean up what's unusable locally, but how the clean-up works server side is not so straightforward. – Mark Adelsberger Aug 28 '17 at 21:22
3

For Bitbucket users, I have a solution for this that has worked for me for months: https://gist.github.com/danielgindi/db0e0a897d8d920f23e155bb5d59e9c6

You basically open Chrome while logged in and viewing the Bitbucket repo, and paste that piece of code into the console. It uses your authorization to delete all LFS files older than the specified time, and it takes a few seconds.

Important note: Never run any piece of code in the browser blindly. Look at the code, make sure you understand what it does. I can tell you "trust me", but you don't know me.

daniel.gindi