
The Llama 2 7B model on Hugging Face (meta-llama/Llama-2-7b) has a PyTorch .pth file, consolidated.00.pth, that is ~13.5 GB in size. The Hugging Face Transformers-compatible model meta-llama/Llama-2-7b-hf has three PyTorch model files that are together ~27 GB in size and two safetensors files that are together around 13.5 GB.
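For reference, here is a minimal sketch of how I checked the per-file sizes, assuming the huggingface_hub package is installed and that your account has accepted the gated Llama 2 license and is logged in:

```python
from huggingface_hub import HfApi

api = HfApi()

# List every file and its size for both repos. Both are gated, so this
# assumes you have accepted the Llama 2 license and have a valid token.
for repo_id in ["meta-llama/Llama-2-7b", "meta-llama/Llama-2-7b-hf"]:
    info = api.model_info(repo_id, files_metadata=True)
    print(repo_id)
    for f in info.siblings:
        if f.size is not None:
            print(f"  {f.rfilename}: {f.size / 1e9:.2f} GB")
```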

Could someone please explain the reason for the big difference in file sizes?

I could not find an explanation in the Hugging Face model cards or in their blog post "Llama 2 is here - get it on Hugging Face".

Update: When the models are downloaded to the Hugging Face cache, I noticed that only the safetensors files are downloaded, not the PyTorch binary model files. So loading the model does not end up downloading both the safetensors and the PyTorch weights.
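A small sketch of the behaviour I'm describing, assuming the transformers, safetensors, and huggingface_hub packages are installed and the gated repo is accessible; when safetensors is available, from_pretrained fetches only the .safetensors shards:

```python
from transformers import AutoModelForCausalLM
from huggingface_hub import snapshot_download

# Loading via transformers: with the safetensors package installed,
# only the *.safetensors shards (plus config/tokenizer files) end up
# in the cache; the pytorch_model-*.bin files are skipped.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Equivalently, the download can be restricted explicitly by filtering
# out the PyTorch .bin weights with snapshot_download.
snapshot_download(
    "meta-llama/Llama-2-7b-hf",
    allow_patterns=["*.safetensors", "*.json", "*.model"],
)
```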

Kumar Saurabh
  • I noticed this too; as I wanted to see how training (fine-tuning) works, I had to download the hf model (13 GB so far) instead of the GPTQ one (3.62 GB) – Emerson Aug 09 '23 at 09:59
