I am working on a data processing application hosted as a web service on an EC2 instance. Every second it generates a small data file (less than 10 KB) in .csv format.
Problem statement: Archive all the generated data files to Amazon Glacier.
My approach: Because the data files are very small, I store them in AWS Kinesis, and after a few hours I flush the data to S3 (because I cannot find a direct way to put data from Kinesis into Glacier). At the end of the day, S3 lifecycle management archives all the objects to Glacier.
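A minimal sketch of what the S3 lifecycle step could look like in Python with boto3. The bucket name, the "csv/" prefix, the rule ID and the one-day delay are hypothetical placeholders, not values from my actual setup.

```python
def glacier_lifecycle_rule(prefix, days=1):
    """Build a lifecycle configuration that moves objects under `prefix` to Glacier."""
    return {
        "Rules": [
            {
                "ID": "archive-to-glacier",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Transition objects to the GLACIER storage class after `days` days.
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_rule(bucket, prefix, days=1):
    """Attach the rule to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported here so building the rule does not require AWS access

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=glacier_lifecycle_rule(prefix, days),
    )
```

Calling apply_rule("my-archive-bucket", "csv/") would attach the rule to that (placeholder) bucket; after that, S3 performs the transition on its own schedule.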
My questions:
Is there a way to transfer data to Glacier directly from Kinesis?
Is it possible to configure Kinesis to flush data to S3/Glacier at the end of the day? Is there a time or memory limit on how long Kinesis can hold data?
If Kinesis cannot transfer data to Glacier directly, is there a workaround? For example, can I write a Lambda function that fetches data from Kinesis and archives it to Glacier?
Is it possible to merge all the .csv files at the Kinesis, S3, or Glacier level?
Is Kinesis suitable for my use case? Is there anything else I could use?
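To make the merge question concrete, this is roughly the operation I mean, sketched in plain Python. It assumes every file starts with the same header row, which may not hold for every data set, and merge_csv is just an illustrative name.

```python
import csv
import io


def merge_csv(texts):
    """Concatenate CSV documents, keeping only the first file's header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = None
    for text in texts:
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue  # skip empty files
        if header is None:
            header = rows[0]
            writer.writerow(header)
        # Assumption: rows[0] is a header identical to the first file's header.
        writer.writerows(rows[1:])
    return out.getvalue()
```

For example, merge_csv(["a,b\n1,2\n", "a,b\n3,4\n"]) gives one document with a single header followed by both data rows.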
I would be grateful if someone could take the time to answer my questions and point me to some references. Please let me know if there is a flaw in my approach or if there is a better way to do this.
Thanks.