
The Llama 2 7B model on Hugging Face (meta-llama/Llama-2-7b) has a PyTorch .pth file, consolidated.00.pth, that is ~13.5 GB in size. The Hugging Face Transformers-compatible model meta-llama/Llama-2-7b-hf has three PyTorch model files that are together ~27 GB in size and two safetensors files that are together around 13.5 GB.
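For reference, here is a minimal sketch of how I checked the per-file sizes, assuming the huggingface_hub package is installed and that your account has accepted the gated Llama 2 license and is logged in:

```python
from huggingface_hub import HfApi

api = HfApi()

# List every file and its size for both repos. Both are gated, so this
# assumes you have accepted the Llama 2 license and have a valid token.
for repo_id in ["meta-llama/Llama-2-7b", "meta-llama/Llama-2-7b-hf"]:
    info = api.model_info(repo_id, files_metadata=True)
    print(repo_id)
    for f in info.siblings:
        if f.size is not None:
            print(f"  {f.rfilename}: {f.size / 1e9:.2f} GB")
```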

Could someone please explain the reason for the big difference in file sizes?

I could not find an explanation in the Hugging Face model cards or in their blog post "Llama 2 is here - get it on Hugging Face".

Update: When the models are downloaded to the Hugging Face cache, I noticed that only the safetensors files are downloaded, not the PyTorch binary model files. So loading the model does not end up downloading both the safetensors and the PyTorch weights.
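A small sketch of the behaviour I'm describing, assuming the transformers, safetensors, and huggingface_hub packages are installed and the gated repo is accessible; when safetensors is available, from_pretrained fetches only the .safetensors shards:

```python
from transformers import AutoModelForCausalLM
from huggingface_hub import snapshot_download

# Loading via transformers: with the safetensors package installed,
# only the *.safetensors shards (plus config/tokenizer files) end up
# in the cache; the pytorch_model-*.bin files are skipped.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Equivalently, the download can be restricted explicitly by filtering
# out the PyTorch .bin weights with snapshot_download.
snapshot_download(
    "meta-llama/Llama-2-7b-hf",
    allow_patterns=["*.safetensors", "*.json", "*.model"],
)
```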

Kumar Saurabh
  • I noticed this too; as I wanted to see how training (fine-tuning) works, I had to download the hf model (13 GB so far) instead of the GPTQ one (3.62 GB) – Emerson Aug 09 '23 at 09:59
