I read that the first chunk can be 256MB and each additional chunk can only be 4MB?

Can anyone confirm or deny this?

https://learn.microsoft.com/en-us/rest/api/datalakestore/webhdfs-filesystem-apis

Danielle Laforte

1 Answer

Through the REST API, you can transfer files of any size. The preferred way is to use a CREATE followed by a number of APPEND calls. The recommended size for each transfer is 4MB or lower.
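To make the CREATE-then-APPEND pattern concrete, here is a minimal sketch in Python. It only shows the chunking logic: `upload`, `iter_chunks`, and the `send` callback are hypothetical names, and `send` stands in for whatever HTTP client issues the actual CREATE and APPEND requests (authentication, URLs, and error handling are left out).

```python
from typing import Callable, Iterator

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB, the recommended upper bound per call


def iter_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most chunk_size bytes."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def upload(data: bytes, send: Callable[[str, bytes], None],
           chunk_size: int = CHUNK_SIZE) -> int:
    """Send the first chunk as CREATE and the rest as APPEND.

    `send(op, payload)` is a hypothetical callback that performs one
    REST call. Returns the number of calls made.
    """
    calls = 0
    for index, chunk in enumerate(iter_chunks(data, chunk_size)):
        op = "CREATE" if index == 0 else "APPEND"
        send(op, chunk)
        calls += 1
    if calls == 0:
        # Empty input: still create an empty file.
        send("CREATE", b"")
        calls = 1
    return calls
```

With a 10MB payload and the default chunk size this would issue one CREATE and two APPEND calls, none larger than 4MB.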

It is also possible, although not recommended, to transfer a larger chunk in a single REST API call. In this mode you need to set the Transfer-Encoding header to chunked; see https://en.wikipedia.org/wiki/Chunked_transfer_encoding. There are some nuances, however. Any individual chunk in the call can fail, in which case you will need to identify at what point the overall transfer failed in order to resume. Chunks above 4MB are not guaranteed to be committed atomically either.
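One way to handle the "where did it fail" question is to ask the server how many bytes it has committed and continue from there. The sketch below assumes that the remote file length can be queried (for example through a file-status call) and that re-sending from that offset is safe; `resume_upload`, `get_remote_length`, and `append` are hypothetical names, not part of any SDK.

```python
from typing import Callable

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB, the recommended upper bound per call


def resume_upload(data: bytes,
                  get_remote_length: Callable[[], int],
                  append: Callable[[bytes], None],
                  chunk_size: int = CHUNK_SIZE,
                  max_retries: int = 3) -> int:
    """Append whatever part of data the server does not have yet.

    Both callbacks are hypothetical: `get_remote_length()` returns the
    number of bytes already committed, `append(payload)` performs one
    APPEND call and raises OSError on failure. Returns the final offset.
    """
    offset = get_remote_length()
    failures = 0
    while offset < len(data):
        try:
            append(data[offset:offset + chunk_size])
            offset += min(chunk_size, len(data) - offset)
            failures = 0
        except OSError:
            failures += 1
            if failures > max_retries:
                raise
            # After a failure the local offset may be wrong, so ask the server.
            offset = get_remote_length()
    return offset
```

Keeping each call at 4MB or lower means a failed call either landed completely or not at all, which is what makes re-reading the committed length a reasonable resume point in this sketch.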

Amit Kulkarni