7

I'd like to create a table from a nested JSON in Athena. The solutions described here using tools like hive Openx-JsonSerDe attempt to mirror the JSON data in the SQL statement. I just want to get a few fields from the JSON file and create the table. I can't seem to find any resources on how to do that.

E.g. JSON file {"records": [{"a": "data1", "b": "data2", "c": "data3"}]} The table I'd like to create just only has columns a and b

David דודו Markovitz
  • 42,900
  • 6
  • 64
  • 88
ebnius
  • 910
  • 2
  • 8
  • 14

1 Answers1

10

I think what you are trying to achieve is unnesting the array to transform one array entry into one row.

This is possible through the correct querying of your data structure.

table definition:

CREATE external TABLE complex (
   records array<struct<a:string,b:string>>
   )
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://bucket/test1/';

query:

select record.a,record.b from complex 
cross join UNNEST(complex.records) as t1(record);
jens walter
  • 13,269
  • 2
  • 56
  • 54
  • 1
    Is it possible to flatten the structure upon the table creation? And avoid the `.`s in the following queries? – Dror Feb 17 '20 at 14:09