I have a very large json file that consists of a single json object in an array that I'm trying to query via AWS Athena.
[ {"reporting_entity_name": "Aetna Life Insurance Company",
"reporting_entity_type": "Third Party Administrator",
"last_updated_on":"2022-09-05",
"providers": [...],
"in-network": [...]} ]
Here's the table definition I created in Athena
CREATE EXTERNAL TABLE IF NOT EXISTS aetna.adobe_temp_5 (
plan string
)
LOCATION 's3://insurance-transparency-data/'
TBLPROPERTIES ('has_encrypted_data' = 'false');
I tried using adapting the query suggested in the answer to this question and got the following error
SELECT plan, e FROM adobe_temp_5 CROSS JOIN UNNEST(CAST(json_parse(plan) AS array(json))) t(e)
INVALID_CAST_ARGUMENT: Cannot cast to array(json). Expected a json array, but got { {"reporting_entity_name": "Aetna Life Insurance Company", "reporting_entity_type": "Third Party Administrator"...
I'm new to presto and ultimately want to be able to query the data that is nested in this JSON object, but I am having a terrible time just accessing the object to query the basics.