23

I know Amazon S3 added multipart upload for huge files. That's great. What I also need is similar functionality on the client side for customers who get partway through downloading a gigabyte-plus file and hit errors.

I realize browsers have some level of retry and resume built in, but when you're talking about huge files I'd like to be able to pick up where they left off regardless of the type of error.

Any ideas?

Thanks, Brian

Bth
  • I've been looking for some useful bit of sample code or SDK documentation without any luck. The main issue is that Amazon doesn't generate the Content-MD5 hash when you ask for a range of data. So if you have the file partially downloaded, what you really want to do is calculate the MD5 of what you have downloaded and then ask Amazon whether that range of bytes has the same hash, so you can just append the rest of the file from Amazon. No such "hey Amazon, give me the MD5 for this range of bytes in the file on S3" API exists AFAIK :-( – kenyee Jan 17 '14 at 16:48
  • Hi Brian. If you were able to get your question answered, can you choose a correct answer? Helps other folks who come to the page looking for that same help. – rICh Jun 24 '15 at 14:53

5 Answers

13

S3 supports the standard HTTP "Range" header if you want to build your own solution.

S3 Getting Objects
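For example, here's a minimal sketch of resuming a download with a plain Range request (the object URL and local path are hypothetical, and it assumes the URL is publicly readable or presigned):

require 'net/http'

# Hypothetical public or presigned object URL and local destination.
uri = URI('https://my-bucket.s3.amazonaws.com/big-file.bin')
local_path = 'big-file.bin'

# Resume from however many bytes are already on disk.
offset = File.exist?(local_path) ? File.size(local_path) : 0

Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  req = Net::HTTP::Get.new(uri)
  req['Range'] = "bytes=#{offset}-"   # ask S3 for the remaining bytes only
  http.request(req) do |res|
    raise "unexpected status #{res.code}" unless res.code == '206'   # 206 Partial Content
    File.open(local_path, 'ab') do |f|   # append to the partial file
      res.read_body { |chunk| f.write(chunk) }
    end
  end
end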

Uriah Carpenter
  • Java API: http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/GetObjectRequest.html#setRange(long,%20long) – Michal Čizmazia Mar 12 '14 at 16:06
4

I use aria2c. For private content, you can use "GetPreSignedUrlRequest" to generate temporary private URLs that you can pass to aria2c.
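GetPreSignedUrlRequest is the .NET SDK call; as a rough sketch of the same idea with the Ruby SDK (aws-sdk-s3 v3, hypothetical bucket/key/region), you could presign the URL and hand it to aria2c:

require 'aws-sdk-s3'

s3 = Aws::S3::Resource.new(region: 'us-east-1')          # hypothetical region
object = s3.bucket('my-bucket').object('big-file.bin')   # hypothetical bucket/key
url = object.presigned_url(:get, expires_in: 3600)       # temporary private URL

# aria2c: -x/-s split the download across connections, -c resumes a partial file.
system('aria2c', '-x', '8', '-s', '8', '-c', '-o', 'big-file.bin', url)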

Ameer Deen
2

Just updating for the current situation: S3 natively supports multipart GET as well as PUT. https://youtu.be/uXHw0Xae2ww?t=1459

Adolfo
rICh
1

S3 has a feature called byte-range fetches. It's kind of the download complement to multipart upload:

Using the Range HTTP header in a GET Object request, you can fetch a byte-range from an object, transferring only the specified portion. You can use concurrent connections to Amazon S3 to fetch different byte ranges from within the same object. This helps you achieve higher aggregate throughput versus a single whole-object request. Fetching smaller ranges of a large object also allows your application to improve retry times when requests are interrupted. For more information, see Getting Objects.

Typical sizes for byte-range requests are 8 MB or 16 MB. If objects are PUT using a multipart upload, it’s a good practice to GET them in the same part sizes (or at least aligned to part boundaries) for best performance. GET requests can directly address individual parts; for example, GET ?partNumber=N.

Source: https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/use-byte-range-fetches.html
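As a rough sketch of a ranged GET with the Ruby SDK (aws-sdk-s3 v3; bucket, key, and sizes here are hypothetical):

require 'aws-sdk-s3'

s3 = Aws::S3::Client.new(region: 'us-east-1')   # hypothetical region

# Fetch just the first 8 MB of a (hypothetical) large object.
resp = s3.get_object(
  bucket: 'my-bucket',
  key: 'big-file.bin',
  range: 'bytes=0-8388607'
)
File.open('big-file.bin.part0', 'wb') { |f| f.write(resp.body.read) }

# If the object was uploaded with multipart upload, parts can be addressed directly.
part = s3.get_object(bucket: 'my-bucket', key: 'big-file.bin', part_number: 1)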

Yann Stoneman
0

NOTE: For Ruby users only

Try the aws-sdk gem for Ruby and download with:

object = Aws::S3::Object.new(...)
object.download_file('path/to/file.rb')

It downloads large files using multipart by default:

Files larger than 5 MB are downloaded using the multipart method

http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Object.html#download_file-instance_method
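For completeness, a minimal end-to-end sketch with that API (assuming the aws-sdk-s3 v3 gem and hypothetical bucket/key/region):

require 'aws-sdk-s3'

# download_file fetches objects larger than 5 MB in multiple parts.
object = Aws::S3::Object.new(
  bucket_name: 'my-bucket',                          # hypothetical bucket
  key: 'big-file.bin',                               # hypothetical key
  client: Aws::S3::Client.new(region: 'us-east-1')   # hypothetical region
)
object.download_file('/tmp/big-file.bin')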

kenju