I am new to athena and am currently working on reading files from s3 and presenting them as table.
Below is how individual files look like in s3 buckets.
sample.json
[
{
"name":"John",
"age":29,
"salary":300
},
{
"name":"Mary",
"age":25,
"salary":500
}
]
I tried creating a table with this data but I am not able to get it in the desire format. I would like to get each dictionary to take up one row. The table should have three columns (Name, Age, Salary).
Tried below but it's putting entire content under data column as it is.
CREATE EXTERNAL TABLE IF NOT EXISTS test(
`data` string
)
LOCATION 's3://somelocation/'
TBLPROPERTIES ('has_encrypted_data'='false');
output
data
-----
[{"name":"John",...},{{"name":"Mary",...}]
Referred this but didn't reach anywhere.
Is there an ideal way to do this ? FYI, the number of dict present in each file may vary and is not at all fixed. So I am looking for a dynamic solution which works good if there is just one dict and if there are 10 in each file in s3.