First some background info: We are currently migrating a large Git repository from Bitbucket to Azure DevOps. There were some challenges because the repository's history was full of binary blobs that, in hindsight, were totally unnecessary.
After previously trying out bfg-repo-cleaner, we ended up using git filter-repo and successfully trimmed the repo size down from several gigabytes to "just" around 400 megabytes (depending on what you count). We also rewrote some tag names.
Our process was to first make a fresh clone from Bitbucket and then run a shell script that shrinks the repo. After that, we pushed the result to a new, blank repository that we had created in Azure DevOps.
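For context, the script boiled down to something like this (a rough sketch; the URLs, the 10M size threshold and the tag prefixes are placeholders, not our actual values):

```
# Fresh mirror clone from Bitbucket (bare repo with all refs)
git clone --mirror git@bitbucket.org:ourteam/bigrepo.git
cd bigrepo.git

# Strip the unnecessary binary blobs from history and rename tags
git filter-repo --strip-blobs-bigger-than 10M --tag-rename old-:new-

# filter-repo removes the origin remote, so point at the new,
# blank Azure DevOps repository and push everything
git remote add azure https://dev.azure.com/ourorg/ourproject/_git/bigrepo
git push --mirror azure
```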
This all went much more smoothly than we expected: git filter-repo was blazing fast, and the whole process took less than an hour.
Before we felt safe doing the move (and forcing all of our devs to freeze the repo for a while), we did a couple of test runs to make sure we did not lose any data and that an Azure DevOps pipeline could build our code just as well as Bamboo used to.
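Our test runs amounted to checks along these lines (a sketch, not our exact commands; "original-clone", "filtered-clone" and the branch name "main" are placeholders):

```
# Commit counts should be close; note that filter-repo prunes commits
# that become empty once their blobs are stripped, so counts and all
# hashes can legitimately differ
git -C original-clone rev-list --count --all
git -C filtered-clone rev-list --count --all

# The checked-out contents of the main branch should be identical
git -C original-clone checkout main
git -C filtered-clone checkout main
diff -r --exclude=.git original-clone filtered-clone
```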
We successfully made a YAML pipeline that took roughly 4 minutes to run in total. Feeling confident that we had solved all our problems, we proceeded to do the entire process for real. Everything went smoothly and we quickly moved all our devs to the new repository.
The problem: Then we noticed that builds on our new pipeline took way longer than they did in our test runs. After some digging in the logs, we found out it had something to do with downloading objects.
New repo (checkout takes 8 minutes in total):
remote: Found 39837 objects to send. (1316 ms)
Receiving objects: 100% (39837/39837), 809.66 MiB | 1.69 MiB/s, done.
Test repo (checkout takes 31 seconds in total):
remote: Found 11772 objects to send. (358 ms)
Receiving objects: 100% (11772/11772), 80.17 MiB | 8.75 MiB/s, done.
I think it's relevant to mention that we use --depth=1 during the checkout. In our test pipeline this drastically brought down the checkout time.
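For reference, in the YAML pipeline the shallow checkout corresponds to fetchDepth: 1 on the checkout step. Judging by the logs, the agent then effectively does something like this (a sketch of the observed behavior with placeholder URL and branch name, not the agent's exact commands):

```
git init repo && cd repo
git remote add origin https://dev.azure.com/ourorg/ourproject/_git/bigrepo
# Shallow fetch: only the objects reachable from the tip commit
git fetch --depth=1 origin main
git checkout FETCH_HEAD
```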
Now we are at a point where we are happy that everything works and we can say goodbye to a costly VPS hosting both Bitbucket and Bamboo, but we are frustrated by build times that are longer than we are used to.
I suspect that our pack files are somehow not optimized well enough, so more data has to be downloaded to "clone" the repo. I say "clone" because the pipeline seems to init a fresh repo, add a remote and fetch, as sketched above. When I do a real clone on my local dev machine, it only takes 5 minutes (including the transfer over the internet and resolving deltas). I find this very strange.
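One thing we are considering, in case this really is a packing issue: aggressively repacking the filtered repo before pushing it, so the server starts out from a single, tightly-deltified pack. Something like this (untested on our side; the window/depth values are just commonly used aggressive settings):

```
# Inside the filtered repository, before the push to Azure DevOps:
# rebuild one pack from scratch with a more expensive delta search
git repack -a -d -f --window=250 --depth=250
git gc --prune=now
```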
Any help would be greatly appreciated. Thanks,
Piet Eckhart