I'm looking to use Athena Partition Projection to analyze log files from AWS application load balancers and firehose emitted logs. The data in S3 is prefixed with year/month/day and potentially hour as well. I've been able to accomplish using the Firehose Example; however this example uses a string formatted partition column.
I'm looking to see if it's possible to use a date formatted partition column instead (with partition project and the firehose emitted s3 prefix format), as our query writers are already used to most of our queries involving date columns and it avoids the need to string format for relative date queries. Is this possible or would the s3 prefixes need to be changed to accomplish?
Table Properties for String column: WORKS
PARTITIONED BY (
`logdate` string)
TBLPROPERTIES (
'projection.enabled'='true',
'projection.logdate.format'='yyyy/MM/dd',
'projection.logdate.interval'='1',
'projection.logdate.interval.unit'='DAYS',
'projection.logdate.range'='NOW-2YEARS,NOW',
'projection.logdate.type'='date',
'storage.location.template'='s3://bucket/prefix/${logdate}')
Table Properties for Date Partition column Does Not Work
PARTITIONED BY (
`logdate` date)
TBLPROPERTIES (
'projection.enabled'='true',
'projection.logdate.format'='yyyy/MM/dd',
'projection.logdate.interval'='1',
'projection.logdate.interval.unit'='DAYS',
'projection.logdate.range'='NOW-2YEARS,NOW',
'projection.logdate.type'='date',
'storage.location.template'='s3://bucket/prefix/${logdate}')
HIVE_INVALID_PARTITION_VALUE: Invalid partition value '2018/11/13' for DATE partition key: logdate=2018%2F11%2F13