I cloned a Git repo and noticed a status line, `Filtering content`, which progressed very slowly. This line doesn't usually appear. What is it?

remote: Enumerating objects: 30, done.
remote: Counting objects: 100% (30/30), done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 16592 (delta 6), reused 9 (delta 4), pack-reused 16562
Receiving objects: 100% (16592/16592), 14.14 MiB | 1.01 MiB/s, done.
Resolving deltas: 100% (7529/7529), done.
Checking out files: 100% (11475/11475), done.
Filtering content:   6% (115/1729), 390.32 MiB | 1.12 MiB/s
Drew Noakes

2 Answers

In Git you can define "filters" that transform files as they move from the index to the work tree ("smudge" filters) and from the work tree to the index ("clean" filters). Typically you'll find a .gitattributes file that associates the filters with files at specific paths.
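
To make the clean/smudge pair concrete, here is a minimal sketch in Python. The filter name `demo`, the script name `demo_filter.py`, and the `$Date$` keyword expansion are all hypothetical choices for illustration. The only thing assumed about Git is that a filter command reads the file content on stdin and writes the transformed content to stdout.

```python
import re
import sys

# Hypothetical wiring (names are illustrative, not taken from the question):
#   .gitattributes:   *.txt filter=demo
#   git config filter.demo.clean  "python3 demo_filter.py clean"
#   git config filter.demo.smudge "python3 demo_filter.py smudge"

STAMP = b"2018-11-15"  # fixed placeholder value so the sketch is deterministic


def smudge(data: bytes) -> bytes:
    """Index -> work tree: expand the bare $Date$ keyword."""
    return data.replace(b"$Date$", b"$Date: " + STAMP + b"$")


def clean(data: bytes) -> bytes:
    """Work tree -> index: collapse an expanded keyword back to $Date$."""
    return re.sub(rb"\$Date: [^$]*\$", b"$Date$", data)


# Only act as a filter when invoked as `demo_filter.py clean|smudge`.
if __name__ == "__main__" and sys.argv[1:] in (["clean"], ["smudge"]):
    transform = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.buffer.write(transform(sys.stdin.buffer.read()))
```

The property worth noticing is that `clean(smudge(x))` gives back `x`, so the expanded text in the work tree never shows up as a change in the index.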

It used to be that this was always handled file by file during checkout or add operations. It can be more efficient to handle all of the "smudge" filters for a checkout in a batch, and Git added support for that relatively recently.

The use case that (I believe) drove that addition is called LFS. With LFS, large content is stored in a secondary repo, with small placeholders ("pointer files") replacing them in the core repo. The "smudge" filter downloads the real content and puts it in place of the pointer file. This is most likely what your repo is doing, and it can be a lengthy process.
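
For a sense of what a pointer file looks like, the sketch below recognizes one in Python. It assumes the Git LFS v1 pointer layout (a `version` line followed by `oid sha256:...` and `size ...` lines). The function name `parse_lfs_pointer` and the sample hash are made up for illustration, and the sketch does not download anything; the real smudge filter is provided by the `git-lfs` tool.

```python
from typing import Optional

LFS_VERSION_LINE = "version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str) -> Optional[dict]:
    """Return {'oid': ..., 'size': ...} if text looks like an LFS pointer, else None."""
    lines = text.splitlines()
    if not lines or lines[0] != LFS_VERSION_LINE:
        return None
    # Remaining lines are "key value" pairs separated by a single space.
    fields = dict(line.split(" ", 1) for line in lines[1:] if " " in line)
    oid = fields.get("oid", "")
    size = fields.get("size", "")
    if not oid.startswith("sha256:") or not size.isdigit():
        return None
    return {"oid": oid[len("sha256:"):], "size": int(size)}


# Made-up pointer content; the hash is a placeholder, not a real object.
example = "\n".join([
    LFS_VERSION_LINE,
    "oid sha256:" + "ab" * 32,
    "size 12345",
]) + "\n"

info = parse_lfs_pointer(example)
print(info["size"] if info else "not an LFS pointer")
```

A file in the work tree that still parses as a pointer has not been smudged yet; once the filter has run, the file holds the real content and `parse_lfs_pointer` would return `None` for it.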

In general, though, the `Filtering content` status line just means that a batch of smudge filters is being run on the checked-out content.

Mark Adelsberger

The repo is using Git LFS, which is a git extension for versioning large files alongside a git repository.

https://git-lfs.github.com/ https://github.com/git-lfs/git-lfs/

Drew Noakes
  • That's probably true; LFS is the only commonly-used example of filtering that's likely to be long-running. But that isn't what the `Filtering content` line "means" per se, and other tools *could* have the same behavior. – Mark Adelsberger Nov 15 '18 at 14:54
  • Thanks for the answer and comment. It's not great that it does not make it obvious that LFS may be using the network when filtering. It explains why it's reporting an absurdly low throughput to me. – Steven Lu Jul 31 '20 at 13:15