I can read data from an S3 location using Spark and Glue without issues, but when I query the same table with Athena, running select * from mytable limit 10; fails with this error:
HIVE_CURSOR_ERROR: Can not read value at 0 in block 0 in file
s3://.../part-00073-123-926b-456-c000.snappy.parquet
What could be the issue, and how can I fix it?
I tried:
MSCK REPAIR TABLE mytable;
That did not help; I get the same error.
The table's CREATE statement is:
CREATE EXTERNAL TABLE `mytable`(
col1 ..,
col2 ..
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3://.../../'
TBLPROPERTIES (
'CrawlerSchemaDeserializerVersion'='1.0',
'CrawlerSchemaSerializerVersion'='1.0',
'UPDATED_BY_CRAWLER'='raw_1',
'averageRecordSize'='105',
'classification'='parquet',
'compressionType'='none',
'objectCount'='155',
'recordCount'='33459791',
'sizeKey'='1738251189',
'typeOfData'='file')
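One thing that may be worth checking (an assumption, not something the error message confirms) is whether the column types declared in the table above match the types actually stored in the failing Parquet file. The sketch below compares the two. The helper names `diff_schemas` and `parquet_file_schema` are hypothetical, the `ARROW_TO_HIVE` mapping is a partial, illustrative one, and reading the file footer assumes `pyarrow` is installed and the file has been copied locally.

```python
# Partial, illustrative mapping from pyarrow type names to Hive/Athena type names.
# Types not listed here are passed through unchanged.
ARROW_TO_HIVE = {
    "int32": "int",
    "int64": "bigint",
    "float": "float",
    "double": "double",
    "string": "string",
    "bool": "boolean",
}


def diff_schemas(table_cols, file_cols):
    """Return (column, table_type, file_type) tuples where the two schemas disagree.

    Both arguments are dicts of column name -> Hive-style type name.
    Column names are compared case-insensitively, since Hive lowercases them.
    """
    file_lower = {name.lower(): typ for name, typ in file_cols.items()}
    problems = []
    for name, table_type in table_cols.items():
        file_type = file_lower.get(name.lower())
        if file_type is None:
            problems.append((name, table_type, "<missing in file>"))
        elif file_type != table_type:
            problems.append((name, table_type, file_type))
    return problems


def parquet_file_schema(path):
    """Read column names and types from a local Parquet file's footer."""
    import pyarrow.parquet as pq  # imported lazily; only needed for this helper

    schema = pq.read_schema(path)
    return {
        name: ARROW_TO_HIVE.get(str(typ), str(typ))
        for name, typ in zip(schema.names, schema.types)
    }
```

For example, calling `diff_schemas({"col1": "bigint"}, parquet_file_schema("part-file.snappy.parquet"))` with the declared column types and a local copy of the file named in the error would list any columns whose declared type differs from what the file stores. The file name and column types here are placeholders, not values from the table above.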