How do I convert String to date in AWS Glue?

Question

When I run crawler from Glue on parquet/csv file in S3 bucket, It takes Date as a string. I changed in edit schema and set it to Date. Not only date but it's not changing any data type in edit schema. When I fire query from Athena "select order_date from parque_sales", the error is:

HIVE_BAD_DATA: Field order_date's type BINARY in parquet is incompatible with type date defined in table schema

I also tried changing schema in Glue studio, the result is same!

Thanks in advance! Please help!

I am using Glue crawler to save the meta data of the file on the S3 bucket in the Glue catalog and using Athena to run queries on the glue tables. But edit schema in Glue databse as well as glue studio which is new service on amazon not working. Seems like custom transformation will work but I am confused as the glue generates his own script. — SK Sayyad, Feb 10 '21 at 22:09

score 1 · Answer 1 · answered Feb 10 '21 at 23:26

1

It seems that you have files where date is stored as string in parquet, not as date. The crawler still receives those as string values and fails while comparing the source schema with the defined schema.

To avoid this issue, you should use explicit casting in the athena query. Something like this :

select date_parse(datestring,'%Y-%m-%d %h:%i:%s')

answered Feb 10 '21 at 23:26

Nishu Tayal

20,106
8
49
101

I tried - select date_parse(last_update,'%Y-%m-%d %h:%i:%s') from city; it gives error like INVALID_FUNCTION_ARGUMENT: Invalid format: ""2006-02-15 04:45:25"" – SK Sayyad Feb 18 '21 at 03:13

How do I convert String to date in AWS Glue?

1 Answers1