I'm trying to process a large file in AWS Lambda, and skipping through the whole file seems wasteful. Is there a "range read" function that reads only a predefined byte range from an S3 object?

olekb

1 Answer

Yes, this is possible. According to the S3 documentation for GET Object in the REST API, the operation supports the HTTP Range header.

Range

Downloads the specified range bytes of an object. For more information about the HTTP Range header, go to http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.

In the example syntax:

GET /ObjectName HTTP/1.1
Host: BucketName.s3.amazonaws.com
Date: date
Authorization: authorization string (see Authenticating Requests (AWS Signature Version 4))
Range:bytes=byte_range

Popular S3 client libraries, such as the AWS SDK for Java, provide convenient client-side APIs for specifying the range information.
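Since the question mentions Lambda, here is a minimal Python sketch of a ranged read, assuming the boto3 SDK, whose `get_object` call accepts a `Range` parameter that is sent as the HTTP Range header. The helper names `byte_range_header` and `read_range`, and the bucket and key in the usage comment, are hypothetical and only for illustration.

```python
def byte_range_header(start, length):
    """Build an HTTP Range header value for `length` bytes starting at offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends, so the last byte is start + length - 1.
    return f"bytes={start}-{start + length - 1}"


def read_range(s3_client, bucket, key, start, length):
    """Fetch only the requested bytes of an object instead of downloading all of it.

    `s3_client` is expected to behave like a boto3 S3 client.
    """
    response = s3_client.get_object(
        Bucket=bucket,
        Key=key,
        Range=byte_range_header(start, length),
    )
    # The response body is a stream; read() returns the bytes of the requested range.
    return response["Body"].read()


# Usage with boto3 (hypothetical bucket and key; requires AWS credentials):
#
#   import boto3
#   s3 = boto3.client("s3")
#   chunk = read_range(s3, "my-bucket", "big-file.bin", start=1024, length=4096)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, without touching the network.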

Chris Nauroth
  • I see claims all over that this is not possible. Do you know if it was added recently? – olekb Jul 14 '17 at 17:42
  • @olekb, I'm not certain when it was added, but I know it was usable at least since early 2016. My experience is using it in Apache Hadoop in the [`S3AInputStream`](https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L154-L155) class. – Chris Nauroth Jul 14 '17 at 17:45