I am using AWS Glue jobs to back up DynamoDB tables to S3 in Parquet format so the data can be queried with Athena.
If I later want to use these Parquet files in S3 to restore a DynamoDB table, this is what I am thinking: read each Parquet file, convert it to JSON, and then insert the JSON-formatted records into DynamoDB (using PySpark along the lines below).
from pyspark.sql import SQLContext

# build the SQL context from the existing SparkContext (sc)
sqlContext = SQLContext(sc)
parquetFile = sqlContext.read.parquet(input_file)  # read the Parquet backup
parquetFile.write.json(output_path)                # write it back out as JSON
Then convert the plain JSON into the DynamoDB-expected JSON format using https://github.com/Alonreznik/dynamodb-json.
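For context, here is a minimal sketch of the kind of conversion that library performs, i.e. mapping plain Python/JSON values onto DynamoDB's typed attribute-value JSON. The function name `to_dynamo_json` and the sample item are my own illustration, and this handles only the common types (no binary or set types):

```python
from decimal import Decimal

def to_dynamo_json(value):
    """Convert a plain Python value into DynamoDB's typed JSON.

    Simplified sketch of what dynamodb-json does; common types only.
    """
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):        # check bool before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float, Decimal)):
        return {"N": str(value)}       # DynamoDB numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_dynamo_json(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_dynamo_json(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value)}")

# hypothetical record, as it might come out of the JSON written by the job above
item = {"id": "user-1", "age": 30, "tags": ["a", "b"], "active": True}
dynamo_item = {k: to_dynamo_json(v) for k, v in item.items()}
# dynamo_item["age"] is now {"N": "30"}, ready for a low-level PutItem call
```

In practice I would just use the library rather than this sketch, but it shows why the extra conversion step is needed before writing back to DynamoDB.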
Does this approach sound right? Are there alternatives to it?