I have an S3 folder with 40k+ JSON files, where each file has the following format:
[{"AAA": "XXXX", "BBB": "XXXX", "CCC": "XXXX"}]
My goal is to read these JSON files (all in one S3 folder), combine them into one structured table, perhaps perform some transformations on the data, and then load them into a MySQL table. This process will probably need to run on a weekly basis.
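To make the intended flow concrete, here is a minimal sketch of the combine-and-load steps on two fake file bodies (the values are made up, and sqlite3 stands in for MySQL so the snippet is self-contained; with a real MySQL driver the SQL would have the same shape):

```python
import json
import sqlite3

# Two fake file bodies in the format shown above (hypothetical values).
file_bodies = [
    '[{"AAA": "a1", "BBB": "b1", "CCC": "c1"}]',
    '[{"AAA": "a2", "BBB": "b2", "CCC": "c2"}]',
]

# 1) Combine: each file is a JSON array of records, so flatten them
#    all into one list of row dicts (the "structured table").
rows = []
for body in file_bodies:
    rows.extend(json.loads(body))

# 2) Load: insert the rows into a table. sqlite3 is only a stand-in
#    for MySQL here so the example runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (AAA TEXT, BBB TEXT, CCC TEXT)")
conn.executemany("INSERT INTO records VALUES (:AAA, :BBB, :CCC)", rows)
count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```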
Is there a quicker way to do ETL on this kind of data source? I would appreciate any feasible recommendations. Thanks a lot!
I tried reading each JSON file through boto3 with something like `obj.get()['Body'].read()` in Python, but iterating over all the files took a few hours to run.
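Roughly, my loop looks like the sketch below. `fetch_body` is a stand-in for the boto3 `obj.get()['Body'].read()` call so the snippet is runnable without AWS credentials, and the keys are placeholders:

```python
import json

# Stand-in for the per-object boto3 call; with real S3 this would be
# obj.get()["Body"].read() for each object listed from the bucket.
def fetch_body(key):
    return '[{"AAA": "v-%s", "BBB": "x", "CCC": "y"}]' % key

keys = ["file1.json", "file2.json", "file3.json"]  # placeholder keys

rows = []
for key in keys:  # one network round trip per file -> slow for 40k files
    rows.extend(json.loads(fetch_body(key)))
```

The sequential loop itself seems to be the bottleneck: each file costs a full S3 round trip, and 40k of those add up to hours.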