
I am looking at this tutorial, and I would like to know whether there is any way to distribute a large file across different objects. For example, say I have a 60 GB video file and four S3 buckets of 15 GB each. How can I split my file so that it fits across storage of these sizes? I would be happy if you could share any tutorial.

Qaqa Leveo

2 Answers


S3 buckets don't have a size limit, so there is typically no reason to split a file across buckets.

If you really want to split the file across buckets (and I would not recommend doing this) you can write the first 25% of bytes to an object in bucket A, the next 25% of bytes to an object in bucket B, etc. But that's moderately complicated (you have to split the source file and upload just the relevant bytes) and then you have to deal with combining them later in order to retrieve the complete file.
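For illustration, here is a minimal sketch of that approach in Python with boto3, assuming four hypothetical bucket names. It reads each slice fully into memory, which is fine as a sketch but impractical for 15 GB slices; real code would stream each slice or use a multipart upload per part.

    # Sketch: split one file across objects in four buckets, then reassemble it.
    # Bucket names, file names, and the key prefix are hypothetical; error
    # handling is omitted for brevity.
    import os
    import boto3

    s3 = boto3.client("s3")
    buckets = ["bucket-a", "bucket-b", "bucket-c", "bucket-d"]  # hypothetical
    source = "video.mp4"
    key_prefix = "video.mp4.part"

    part_size = -(-os.path.getsize(source) // len(buckets))  # ceiling division

    # Upload: write the i-th slice of bytes as its own object in bucket i.
    with open(source, "rb") as f:
        for i, bucket in enumerate(buckets):
            chunk = f.read(part_size)
            if not chunk:
                break
            s3.put_object(Bucket=bucket, Key=f"{key_prefix}{i}", Body=chunk, ACL="private")

    # Retrieval: download the slices in order and concatenate them.
    with open("video-restored.mp4", "wb") as out:
        for i, bucket in enumerate(buckets):
            obj = s3.get_object(Bucket=bucket, Key=f"{key_prefix}{i}")
            out.write(obj["Body"].read())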

Why do you want to split the file across buckets?

jarmod
  • Thank you for the answer. The problem is that I need to do this for my project, so I would like to know how I can split the file while uploading, or what is a way to split it across multiple storage locations. – Qaqa Leveo Dec 21 '17 at 15:44
  • Use client-side file I/O to read the relevant portion of the source file and then write it using boto3's put_object(ACL='private', Body=b'bytes', ...) – jarmod Dec 21 '17 at 17:31
  • Thank you so much. Your comment was very helpful. Thank you again – Qaqa Leveo Dec 21 '17 at 21:11

Check this AWS documentation out. I think it would be useful.

http://docs.aws.amazon.com/AmazonS3/latest/dev/UploadingObjects.html

The important part of the linked page is below:

Depending on the size of the data you are uploading, Amazon S3 offers the following options:

  • Upload objects in a single operation—With a single PUT operation, you can upload objects up to 5 GB in size. For more information, see Uploading Objects in a Single Operation.
  • Upload objects in parts—Using the multipart upload API, you can upload large objects, up to 5 TB. The multipart upload API is designed to improve the upload experience for larger objects. You can upload objects in parts. These object parts can be uploaded independently, in any order, and in parallel. You can use a multipart upload for objects from 5 MB to 5 TB in size. For more information, see Uploading Objects Using Multipart Upload API.

We recommend that you use multipart uploading in the following ways:

  • If you're uploading large objects over a stable high-bandwidth network, use multipart uploading to maximize the use of your available bandwidth by uploading object parts in parallel for multi-threaded performance.
  • If you're uploading over a spotty network, use multipart uploading to increase resiliency to network errors by avoiding upload restarts. When using multipart uploading, you need to retry uploading only the parts that are interrupted during the upload. You don't need to restart uploading your object from the beginning.

For more information about multipart uploads, see Multipart Upload Overview.
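As a concrete illustration of the multipart upload API described above, here is a minimal Python/boto3 sketch; the bucket name, key, file name, and part size are hypothetical placeholders.

    import boto3

    s3 = boto3.client("s3")
    bucket, key = "my-example-bucket", "videos/big-video.mp4"  # hypothetical
    part_size = 100 * 1024 * 1024  # 100 MB; every part except the last must be at least 5 MB

    # Step 1: start the multipart upload and remember its id.
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

    parts = []
    try:
        with open("big-video.mp4", "rb") as f:
            part_number = 1
            while True:
                chunk = f.read(part_size)
                if not chunk:
                    break
                # Step 2: upload each part (parts could also be sent in parallel).
                resp = s3.upload_part(Bucket=bucket, Key=key, PartNumber=part_number,
                                      UploadId=upload_id, Body=chunk)
                parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
                part_number += 1
        # Step 3: ask S3 to assemble the uploaded parts into a single object.
        s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                                     MultipartUpload={"Parts": parts})
    except Exception:
        # Abort on failure so S3 doesn't keep storing (and charging for) orphaned parts.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise

In practice, boto3's higher-level upload_file method performs a multipart upload automatically for files above a configurable threshold, so the low-level calls above are rarely needed.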