I have created a repository on GitHub containing a snapshot of a blockchain database folder, which I update every day with the following commands:

git pull
git fsck
git prune
git add .
dt=$(date '+%d/%m/%Y');
git commit -a -m "$dt"
git push

This repository is later used for fast sync via the command git clone -b main --depth 1 --single-branch https://github.com/myrepository/repository.git.

The problem is that the .git folder on the server that updates the repository is extremely large, several times the actual size of the blockchain folder. Is there any way to reduce its size other than deleting and re-creating the repository?

Skuld

1 Answer

Git logically stores a complete snapshot of every file in every commit. However, because doing this in a naïve way would be wildly inefficient, Git uses some techniques to reduce the size of the data, most notably compression and deltification.

It's likely that your data isn't very redundant and therefore won't deltify or compress well; this is especially true for cryptographic outputs, which are supposed to be indistinguishable from random. This means that operations like git gc and git repack will both take a long time and also not be very effective at reducing the size of your data.
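The effect is easy to see outside Git. The following is only a rough sketch: it uses gzip as a stand-in for the zlib compression Git applies to objects, and /dev/urandom and /dev/zero as stand-ins for hash-heavy and highly redundant data respectively. The 1 MB sample size and the variable names are arbitrary choices for illustration, not anything taken from the repository in question.

```shell
#!/bin/sh
# Assumption: gzip approximates the zlib compression Git uses for objects.
size=1000000   # sample size in bytes (arbitrary)

# Random bytes stand in for cryptographic output such as block hashes.
random_size=$(head -c "$size" /dev/urandom | gzip -c | wc -c | tr -d ' ')

# Zero bytes stand in for highly redundant data.
zero_size=$(head -c "$size" /dev/zero | gzip -c | wc -c | tr -d ' ')

echo "random input: $random_size bytes after compression"
echo "zero input:   $zero_size bytes after compression"
```

On a typical system the random sample should stay at roughly its original size, while the zero-filled sample shrinks to a tiny fraction of it. Data that behaves like the first case gives git gc and git repack very little to work with.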

Git is not a good tool to use in this case, and you would be better off using a different one better suited to the structure of your data. If your repository continues to grow in this way, it will likely exceed GitHub's maintenance timeout due to its pathological structure and then GitHub will ask you to remove it.

bk2204
  • Is there any way to keep only the latest version of each file and delete all previous versions, including the stored history and changes? – Skuld Jan 29 '22 at 04:35
  • If there is no way to reduce the size, the solution could perhaps be to delete the repository and re-create it with the same name. Is it possible to do that using git? – Skuld Jan 29 '22 at 04:43
  • I found the tool BFG Repo-Cleaner. Can I use this tool to update my repository? – Skuld Jan 29 '22 at 05:00
  • I also found the `git annex add` command, but it looks like it works only with external sources like Amazon S3. – Skuld Jan 29 '22 at 05:47
  • If you don't want to keep history, then you don't want Git at all. You want a website to host your files or a cloud bucket. – bk2204 Jan 29 '22 at 16:46