You say that you have "100s of millions of files", so I shall assume you have 400 million objects of 100KB each, making 40TB of storage. Please adjust accordingly. I have shown my calculations so that people can help identify my errors.
Initial upload
PUT requests in Amazon S3 are charged at $0.005 per 1,000 requests. Therefore, 400 million PUTs would cost $2000. (.005*400m/1000)
This cost cannot be avoided if you wish to create them all as individual objects.
Future uploads would be the same cost at $5 per million.
Storage
Standard storage costs $0.023 per GB per month, so storing 400 million 100KB objects would cost $920/month. (.023*400m*100/1m)
Storage costs can be reduced by using lower-cost Storage Classes.
Access
GET requests are $0.0004 per 1,000 requests, so downloading 1 million objects each month would cost 40c/month. (.0004*1m/1000)
If the data is being transferred to the Internet, Data Transfer costs of $0.09 per GB would apply. The Data Transfer cost of downloading 1 million 100KB objects would be $9/month. (.09*1m*100/1m)
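To make it easier to adjust the figures, the calculations above can be put into a small script. This is only a sketch: the prices are the ones quoted in this answer (they vary by region and change over time), the function names are my own, and it uses decimal units (1 GB = 1,000,000 KB) to match the arithmetic above.

```python
# Prices quoted in this answer (USD). Assumption: check current S3 pricing
# for your region before relying on these numbers.
PUT_PER_1000 = 0.005          # per 1,000 PUT requests
GET_PER_1000 = 0.0004         # per 1,000 GET requests
STORAGE_PER_GB_MONTH = 0.023  # Standard storage, per GB per month
TRANSFER_PER_GB = 0.09        # Data Transfer out to the Internet, per GB
KB_PER_GB = 1_000_000         # decimal units, as in the calculations above


def upload_cost(num_objects):
    """One-off cost of PUTting num_objects individual objects."""
    return PUT_PER_1000 * num_objects / 1000


def storage_cost_per_month(num_objects, object_kb):
    """Monthly storage cost for num_objects objects of object_kb KB each."""
    return STORAGE_PER_GB_MONTH * num_objects * object_kb / KB_PER_GB


def download_cost(num_objects, object_kb, to_internet=True):
    """GET request cost, plus Data Transfer cost if going to the Internet."""
    request_cost = GET_PER_1000 * num_objects / 1000
    transfer_cost = 0.0
    if to_internet:
        transfer_cost = TRANSFER_PER_GB * num_objects * object_kb / KB_PER_GB
    return request_cost + transfer_cost


print(f"Initial upload:  ${upload_cost(400_000_000):,.2f}")
print(f"Storage / month: ${storage_cost_per_month(400_000_000, 100):,.2f}")
print(f"Access / month:  ${download_cost(1_000_000, 100):,.2f}")
```

Change the object count and size in the three calls at the bottom to match your real numbers.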
Analysis
You seem to be most fearful of the initial cost of uploading 100s of millions of objects at a cost of $5 per million objects.
However, storage will also be expensive, at $2.30/month per million objects ($920/month for 400m objects). That ongoing cost is likely to dwarf the cost of the initial uploads.
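As a rough check on that comparison, the sketch below works out how many months of storage add up to the one-off upload cost. It assumes the prices quoted in this answer and 100KB objects; the function name is my own. The result does not depend on the number of objects, because both costs scale with the object count.

```python
def months_until_storage_exceeds_upload(put_per_1000, storage_per_gb_month, object_kb):
    """Months after which cumulative storage cost passes the one-off PUT cost."""
    upload_per_object = put_per_1000 / 1000
    # Decimal units: 1 GB = 1,000,000 KB, as in the calculations above.
    storage_per_object_month = storage_per_gb_month * object_kb / 1_000_000
    return upload_per_object / storage_per_object_month


# Assumed prices: $0.005 per 1,000 PUTs, $0.023 per GB per month, 100KB objects.
months = months_until_storage_exceeds_upload(0.005, 0.023, 100)
print(f"Storage overtakes the upload cost after about {months:.1f} months")
```

With these assumptions, storage overtakes the upload cost in a little over two months, and smaller objects push that point further out.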
Some alternatives would be:
- Store the data on-premises (disk storage is $100/4TB, so 40TB for 400m files would require $1000 of disks, but you would want extra drives for redundancy), or
- Store the data in a database: There are no 'PUT' costs for databases, but you would need to pay for running the database. This might work out to a lower cost, or
- Combine the data in the files (which you say you do not wish to do), but in a way that can be easily split apart. For example, marking records by an identifier for easy extraction, or
- Use a different storage service, such as DigitalOcean, which does not appear to have a 'PUT' cost.
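To illustrate the "combine the files" option, here is a minimal sketch of one possible approach, not a recommendation of a specific format: records are concatenated into a single blob, and a separate index maps each identifier to its byte offset and length. The function names and index layout are hypothetical. S3 GET requests accept an HTTP Range header, so a single record could in principle be fetched from a combined object without downloading the whole thing; the last helper only builds that header value and does not call S3.

```python
def pack(records):
    """Concatenate records (dict of id -> bytes) into one blob plus an index."""
    blob = bytearray()
    index = {}
    for record_id, payload in records.items():
        # Remember where this record starts and how long it is.
        index[record_id] = (len(blob), len(payload))
        blob.extend(payload)
    return bytes(blob), index


def extract(blob, index, record_id):
    """Pull one record back out of a blob that is already in memory."""
    offset, length = index[record_id]
    return blob[offset:offset + length]


def range_header(index, record_id):
    """Build an HTTP Range header value (inclusive end byte) for one record."""
    offset, length = index[record_id]
    return f"bytes={offset}-{offset + length - 1}"


records = {"file-0001": b"first record", "file-0002": b"second record"}
blob, index = pack(records)
print(extract(blob, index, "file-0002"))
print(range_header(index, "file-0002"))
```

Under the prices assumed above, packing 1,000 records into each object would cut the 400 million PUTs to 400,000, or about $2 instead of $2000. The index would need to be stored somewhere too, for example in a small database or a separate object.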