
I'd like to pre-calculate the final size of a tar file without actually creating it. I already know the sizes and names of all the files that will go into the archive. To keep it simple, assume plain archiving only, without compression.

How can I do that with tar? Or is there another archive format that would let me do this?
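For a plain ustar archive with default blocking, the size follows directly from the member sizes: each file contributes one 512-byte header block plus its data rounded up to a multiple of 512 bytes, the archive ends with two 512-byte zero blocks, and the whole output is padded to the 10 KiB record size that tar uses by default. Here is a minimal sketch of that calculation, assuming names fit in the 100-character ustar name field and no PAX/extended headers are emitted:

```python
import math

BLOCK = 512           # tar stores everything in 512-byte blocks
RECORD = BLOCK * 20   # default blocking factor 20 -> archive padded to 10240 bytes

def predicted_tar_size(files):
    """files: iterable of (name, size_in_bytes) pairs.

    Assumes plain ustar members (names <= 100 bytes, no PAX/GNU long-name
    headers, no extended attributes) and no compression.
    """
    total = 0
    for name, size in files:
        if len(name.encode("utf-8")) > 100:
            # Longer names need extra header blocks, which this estimate ignores.
            raise ValueError(f"name too long for a plain ustar header: {name}")
        total += BLOCK                               # one header block per member
        total += math.ceil(size / BLOCK) * BLOCK     # data padded to a 512-byte boundary
    total += 2 * BLOCK                               # end-of-archive marker: two zero blocks
    return math.ceil(total / RECORD) * RECORD        # pad to the record size

# Example with two files of known size:
print(predicted_tar_size([("a.txt", 1000), ("b.bin", 2_000_000)]))  # -> 2007040
```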

As background: I want to read a bunch of files from S3, tar/zip them, and put the result back into S3 as a new file. I'd like to do this without buffering to memory or disk, i.e. in a single-pass, streaming read-write fashion.

S3, however, wants me to pass an exact content length, hence the pre-calculation I'm asking about. I could obtain the final size by generating the archive upfront into some /dev/null-style stream and then doing the actual archiving again, but ideally I'd like to avoid that double read.
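If the simple formula above isn't enough (long names, PAX headers, etc.), the /dev/null dry run doesn't actually have to re-read the real data: the header content depends only on names and sizes, so you can feed dummy zero bytes into a throwaway archive and just count what comes out. A sketch with Python's `tarfile`, assuming the real archive is later produced by the same library with the same settings (same format, integer mtimes, and so on):

```python
import tarfile

class CountingSink:
    """Write-only target that just counts bytes, i.e. a /dev/null with a meter."""
    def __init__(self):
        self.size = 0
    def write(self, data):
        self.size += len(data)
        return len(data)

class Zeros:
    """Readable stream of `size` zero bytes, standing in for the real file data."""
    def __init__(self, size):
        self.remaining = size
    def read(self, n=-1):
        n = self.remaining if n < 0 else min(n, self.remaining)
        self.remaining -= n
        return b"\0" * n

def dry_run_tar_size(files):
    """files: iterable of (name, size). Returns the exact archive size that
    tarfile would produce, including any long-name/PAX headers, without
    reading any real file contents."""
    sink = CountingSink()
    with tarfile.open(fileobj=sink, mode="w|") as tar:   # streaming write, no seeks
        for name, size in files:
            info = tarfile.TarInfo(name)
            info.size = size
            tar.addfile(info, Zeros(size))
    return sink.size

print(dry_run_tar_size([("a.txt", 1000), ("some/long/nested/path/b.bin", 2_000_000)]))
```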

ayan ahmedov
  • You will need to use some memory to complete an S3->Tar->S3 pipeline, so you can just read in, say, 8 MiB chunks, and upload each chunk as part of a multipart upload, without needing to know the final size. – Anon Coward Jun 05 '23 at 15:39
  • `multipart` + `memory buffer` seems a legit idea. Thanks! Even better, actually: Zip/Gzip with compression would be preferable – ayan ahmedov Jun 06 '23 at 18:48
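Following the multipart suggestion from the comments: with boto3 you can open a multipart upload, buffer the tar stream in roughly 8 MiB chunks (S3's minimum part size is 5 MiB, except for the last part), and complete the upload without ever knowing the final content length. A rough sketch, where the bucket and key names are placeholders and error handling (including `abort_multipart_upload` on failure) is omitted:

```python
import boto3
import tarfile

PART_SIZE = 8 * 1024 * 1024  # 8 MiB parts; S3's multipart minimum is 5 MiB (except the last)

class S3MultipartWriter:
    """File-like sink that streams whatever is written to it into an S3
    multipart upload, so the final content length never needs to be known."""

    def __init__(self, s3, bucket, key):
        self.s3, self.bucket, self.key = s3, bucket, key
        self.upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
        self.buffer = bytearray()
        self.parts = []

    def write(self, data):
        self.buffer.extend(data)
        while len(self.buffer) >= PART_SIZE:
            self._flush(bytes(self.buffer[:PART_SIZE]))
            del self.buffer[:PART_SIZE]
        return len(data)

    def _flush(self, chunk):
        number = len(self.parts) + 1
        resp = self.s3.upload_part(Bucket=self.bucket, Key=self.key,
                                   UploadId=self.upload_id,
                                   PartNumber=number, Body=chunk)
        self.parts.append({"ETag": resp["ETag"], "PartNumber": number})

    def close(self):
        if self.buffer:                              # final part may be smaller than 5 MiB
            self._flush(bytes(self.buffer))
        self.s3.complete_multipart_upload(
            Bucket=self.bucket, Key=self.key, UploadId=self.upload_id,
            MultipartUpload={"Parts": self.parts})

s3 = boto3.client("s3")
sink = S3MultipartWriter(s3, "my-bucket", "bundle.tar")            # placeholder names
with tarfile.open(fileobj=sink, mode="w|") as tar:                 # streaming tar into the sink
    for key, size in [("a.txt", 1000), ("b.bin", 2_000_000)]:      # the known names and sizes
        info = tarfile.TarInfo(key)
        info.size = size
        body = s3.get_object(Bucket="my-bucket", Key=key)["Body"]  # streamed, not buffered
        # Assumes Body.read(n) returns full n-byte chunks, which holds for plain
        # (non-chunked) S3 GET responses; otherwise wrap it in a buffering adapter.
        tar.addfile(info, body)
sink.close()
```

This keeps only one part (about 8 MiB) in memory at a time, and nothing touches local disk.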

0 Answers