I have been tasked with migrating our Team Foundation Server (TFS) repositories into the agency GitHub Enterprise (GHE) and keeping the entire changelog intact. I am using the git-tfs tool with the following syntax to create a local copy of the primary source branch:
git tfs clone --all --with-labels <server>:8080/tfs/ $/<branch>
The process takes about 30 hours and when that completes I have a directory structure of ~45 GB that contains a ~6 GB .git repository sub-structure. When I attempt to push this to our agency GHE I get errors regarding large files, because the agency doesn't have Large File Storage enabled and has no plans to enable it.
I have brought this to the attention of my superiors and been instructed to "remove the large files and make the upload." I ran an audit of all files >20 MB as instructed and have a spreadsheet I can copy/paste into Notepad++ for scripting the removal process.
I have attempted a git rm and then a git commit -m on the larger files, but am learning that this doesn't work, as the changelog still tracks the large files. The git push to GHE simply threw back the same errors I was seeing before.
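To confirm that the large files are still reachable from history after a git rm, a sketch like the following could list every oversized blob across all refs. The function name list_large_blobs and the 20 MB default are my own choices for illustration, not part of any tool.

```shell
# List blobs larger than a threshold that are reachable from any ref,
# including files that were deleted in later commits.
# Usage: list_large_blobs <repo-path> [limit-in-bytes]   (hypothetical helper)
list_large_blobs() {
  repo="$1"
  limit_bytes="${2:-20971520}"   # 20 MB by default (assumption)
  git -C "$repo" rev-list --objects --all |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v limit="$limit_bytes" '
      $1 == "blob" && $2 + 0 > limit + 0 {
        size = $2
        sub(/^[^ ]+ [^ ]+ /, "")      # drop type and size, keep the full path
        printf "%s\t%s\n", size, $0
      }' |
    sort -rn
}
```

Each output line is the uncompressed size in bytes, a tab, and the path. If a file removed with git rm still shows up here, the push will presumably keep failing until history itself is rewritten.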
My research has led me to several solutions, such as BFG Repo-Cleaner and git filter-repo. As far as I can tell, both tools expect a --mirror copy of the repository, which git-tfs doesn't support. git-tfs only supports a --bare option, and the documentation for git clone doesn't help me understand the difference. I understand that both are just the repository directory and not the raw file structure, but not much more. I also do not understand how to push a mirrored local copy that doesn't have a file structure into GHE.
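One way I could see the difference for myself is to compare the remote configuration that each kind of clone writes. This is only a sketch; show_clone_config is a made-up helper name, and the paths in the usage note are placeholders.

```shell
# Print the fetch refspec and mirror flag recorded for "origin" in a clone.
# Either value is printed empty when the setting is absent.
show_clone_config() {
  printf 'fetch=%s mirror=%s\n' \
    "$(git -C "$1" config --get remote.origin.fetch)" \
    "$(git -C "$1" config --get remote.origin.mirror)"
}
```

For example, after git clone --bare primary bare.git and git clone --mirror primary mirror.git, running show_clone_config on each directory shows which one records a refspec covering all refs. Neither has a working tree.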
I've raised these issues to my leadership and been instructed to:
- git-tfs clone TFS to local
- git clone --mirror the local copy to a secondary local copy
- Attempt to run BFG or git-filter-repo against the secondary copy
- ???? [I don't know what comes after this]
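Written out as commands, my current understanding of these steps is roughly the sketch below. It is a guess at the sequence, not a confirmed procedure: the function names, the remote name ghe, the 20M limit, and the final push steps are all my assumptions. By default it only prints the commands (dry run) so nothing touches the 30-hour clone by accident.

```shell
# Print each command; execute it only when DRY_RUN=0 is set in the environment.
run() {
  printf '+ %s\n' "$*"
  [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# Usage: migrate_without_large_blobs <primary-repo> <scratch-dir> <ghe-url> [limit]
# All four arguments are placeholders for illustration.
migrate_without_large_blobs() {
  primary="$1"; scratch="$2"; ghe_url="$3"; limit="${4:-20M}"
  # 1. Secondary copy with no working tree; its origin is the primary local repo.
  run git clone --mirror "$primary" "$scratch" &&
  # 2. Rewrite history in the secondary copy, dropping oversized blobs.
  run git -C "$scratch" filter-repo --strip-blobs-bigger-than "$limit" &&
  # 3. Guessed final steps: add GHE as a separate remote and push from the mirror.
  run git -C "$scratch" remote add ghe "$ghe_url" &&
  run git -C "$scratch" push ghe --all &&
  run git -C "$scratch" push ghe --tags
}
```

If this is right, the secondary copy would be pushed directly to GHE without going back through the primary, but I have not verified that, nor which refs git-tfs creates that --all and --tags would miss.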
I'm unclear on several things.
- Doesn't the mirrored secondary still point to the TFS as origin?
- Do I have to push the secondary local to the primary local and then push the primary local to GHE, as the secondary has no file structure?
- How do I perform an audit of the changelog to see what was modified to ensure that history is preserved? I don't want to be punished 6 months or a year from now because the developers are looking for a specific change and can't find it.
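For the audit question, one idea I have is to compare commit metadata between the original and the rewritten repository, since the commit hashes themselves will change after a history rewrite. The sketch below is an assumption about what a reasonable check looks like; commit_fingerprint and audit_history are names I invented.

```shell
# One line per commit across all refs: author date, author email, subject.
# Hashes are deliberately left out because rewriting history changes them.
commit_fingerprint() {
  git -C "$1" log --all --format='%aI|%ae|%s' | LC_ALL=C sort
}

# Print the commits that exist in the original repo but not in the rewritten one.
# Usage: audit_history <original-repo> <rewritten-repo>
audit_history() {
  before_list=$(mktemp)
  after_list=$(mktemp)
  commit_fingerprint "$1" > "$before_list"
  commit_fingerprint "$2" > "$after_list"
  # comm -23 keeps lines unique to the first file, i.e. commits lost by the rewrite.
  LC_ALL=C comm -23 "$before_list" "$after_list"
  rm -f "$before_list" "$after_list"
}
```

Empty output would mean every original commit still has a counterpart with the same author date, author, and subject. Commits listed here would most likely be ones that touched only removed files and became empty, which is what I would need to document for the developers.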