1

I have a S3 bucket with many JSON files.

JSON file example:

{"id":"x109pri", "import_date":"2017-11-06"}

The "import_date" field is DATE type in standard format YYYY-MM-DD.

I am creating a Database connection in Athena to link all these JSON files.

However, when I create a new table in Athena and specify this field format as DATE I get: "Internal error" with no other explanation provided. To clarify, the table gets created just fine but if I want to preview it or query, I get this error.

However, when I specify this field as STRING then it works fine.

So the question is, is this a BUG or what should be the correct value for Athena DATE format?

pdolinaj
  • 1,087
  • 13
  • 21

1 Answers1

2

The date column type does not work with certain combinations of SerDe and/or data source.

For example using a DATE column with org.openx.data.jsonserde.JsonSerDe fails, while org.apache.hive.hcatalog.data.JsonSerDe works.

So with the following table definition, querying your JSON will work.

create external table datetest(
  id string,
  import_date date
)
ROW FORMAT  serde 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3://bucket/datetest'
jens walter
  • 13,269
  • 2
  • 56
  • 54