So we have CSV files in a S3 bucket and when the AWS Glue crawler crawls through all the files its able to identify the schema of a struct field correctly as follows:
struct<installation_id:string,device_id:string>
and the CSV file contents are as follows:
publisher_parameters
"{""installation_id"":""a86f3071-e271-4979-8096-d501b8645c7e"",""device_id"":""864006043285537""}"
"{""installation_id"":""3d3a964e-e78f-428b-93cd-1ddfec1fb6d6"",""device_id"":""e9fe5f395a5a4e67""}"
However, when I am reading using AWS Athena its reading the entire string as the value of installation_id
within our struct. I have tried using the org.apache.hadoop.hive.serde2.OpenCSVSerde
SerDe as well but still facing the same issue.
Does AWS Glue use a SerDe, which is different from what is provided in the table specification, when crawling the files?