Questions tagged [amazon-athena]

Amazon Athena is a service for running SQL queries against data stored in files on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Athena is powered by the Presto query engine and uses Apache Hive Metastore for database and table definitions. It supports both dynamic and static partitions for tables. Athena supports data stored in delimited text files, JSON, ORC, Avro, and Parquet.

Athena is serverless: there is no infrastructure to manage, and cost is based on the amount of data each query scans.
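Because billing depends on the data scanned per query, a rough cost estimate is simple arithmetic. The sketch below is illustrative only: the function name `estimate_query_cost` is hypothetical, and the default rate (5 USD per TB) and 10 MB per-query minimum are assumptions that should be checked against the current Athena pricing page.

```python
def estimate_query_cost(bytes_scanned, usd_per_tb=5.0, min_bytes=10 * 1024**2):
    """Estimate the cost in USD of one query from the bytes it scanned.

    usd_per_tb and min_bytes are assumed defaults, not authoritative pricing.
    """
    # Apply the assumed per-query minimum before converting to terabytes.
    billable_bytes = max(bytes_scanned, min_bytes)
    return billable_bytes / 1024**4 * usd_per_tb


if __name__ == "__main__":
    # A query scanning 200 GB under the assumed rate.
    print(f"{estimate_query_cost(200 * 1024**3):.4f} USD")
```

Under these assumptions, columnar formats such as Parquet or ORC and partition pruning lower cost simply by reducing `bytes_scanned`.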

See the Athena Documentation for more.

3440 questions
15
votes
4 answers

Athena date format unable to convert string to date format

I tried the syntax below, but none of it helped to convert a string-type column to a date: select INVC_,APIDT,APDDT from APAPP100 limit 10 select current_date, APIDT,APDDT from APAPP100 limit 10 select date_format( b.APIDT, '%Y-%m-%d') from APAPP100…
vinsent paramanantham
  • 953
  • 3
  • 15
  • 34
15
votes
3 answers

Access AWS athena through JPA spring boot

I am trying to use AWS Athena through a Spring Boot JPA datasource. I tried setting up the datasource with the given properties. spring.datasource.driver-class-name=com.amazonaws.athena.jdbc.AthenaDriver …
14
votes
2 answers

AWS Athena - GENERIC_INTERNAL_ERROR: Number of partition values does not match number of filters

I'm querying a table in Athena that is giving the error: GENERIC_INTERNAL_ERROR: Number of partition values does not match number of filters. I was able to query it earlier, but added another partition (AWS Glue job) to try to optimize joins. I will…
Neil Galloway
  • 141
  • 1
  • 1
  • 5
14
votes
3 answers

How to read quoted CSV with NULL values into Amazon Athena

I'm trying to create an external table in Athena using a quoted CSV file stored on S3. The problem is that my CSV contains missing values in columns that should be read as INTs. Simple…
Mikolaj
  • 1,395
  • 2
  • 13
  • 32
14
votes
7 answers

Can AWS Athena update or insert data stored in S3?

The documentation just says that it is a query service but does not explicitly state whether it can perform data updates. If Athena cannot insert or update, is there any other AWS service that can, like a normal DB?
kzfid
  • 688
  • 3
  • 10
  • 17
14
votes
2 answers

Athena create table from parquet schema

Is there a way to create a table in Amazon Athena directly from a Parquet file based on an Avro schema? The schema is encoded into the file, so it seems stupid that I need to actually create the DDL myself. I saw this and also another duplication but…
NetanelRabinowitz
  • 1,534
  • 2
  • 14
  • 26
14
votes
3 answers

How to Query parquet data from Amazon Athena?

Athena creates a temporary table using fields in S3 table. I have done this using JSON data. Could you help me on how to create table using parquet data? I have tried following: Converted sample JSON data to parquet data. Uploaded parquet data to…
rajeswari
  • 279
  • 1
  • 4
  • 13
14
votes
1 answer

Store multiple elements in json files in AWS Athena

I have some JSON files stored in an S3 bucket, where each file has multiple elements of the same structure. For…
Swagatika
  • 857
  • 1
  • 11
  • 32
13
votes
4 answers

Azure Equivalent of AWS Athena over s3

I have an AWS workload that stores csv files in partitions in s3 and then queries the data with SQL queries using Athena, writing the results back to s3. I'm looking for an equivalent behavior in Azure, where I could store csv files in a storage and…
zuckermanori
  • 1,675
  • 5
  • 22
  • 31
13
votes
1 answer

AWS Athena partition fetch all paths

Recently, I've experienced an issue with AWS Athena when there is quite a high number of partitions. The old version had a database and tables with only 1 partition level, say id=x. Let's take one table; for example, where we store payment parameters…
null
  • 1,944
  • 1
  • 14
  • 24
13
votes
1 answer

AWS Glue crawler - partition keys types

I am using Spark to write files to S3 in ORC format. Also using Athena to query this data. I am using the following partition keys: s3://bucket/company=1123/date=20190207 Once I execute the Glue crawler to run on the bucket everything works as…
Alex Stanovsky
  • 1,286
  • 1
  • 13
  • 28
13
votes
4 answers

How to query in AWS athena connected through S3 using lambda functions in python

I have my .csv files saved in an S3 bucket. I am able to query the data in S3 using AWS Athena. Is there any way to connect a Lambda function to Athena and query the data from the Lambda function? Please help. Thanks
13
votes
2 answers

Athena not adding partitions after msck repair table

I have a Firehose that stores data in S3 in the default directory structure: YY/MM/DD/HH and a table in Athena with these columns defined as partitions: year: string, month: string, day: string, hour: string. After running msck repair table clicks I…
Sam
  • 2,761
  • 3
  • 19
  • 30
12
votes
2 answers

HIVE_PARTITION_SCHEMA_MISMATCH

I'm getting this error from AWS Athena: HIVE_PARTITION_SCHEMA_MISMATCH: There is a mismatch between the table and partition schemas. The types are incompatible and cannot be coerced. The column 'id' in table 'db.app_events' is declared as type…
Burak
  • 5,706
  • 20
  • 70
  • 110
12
votes
2 answers

How to solve this HIVE_PARTITION_SCHEMA_MISMATCH?

I have partitioned data in CSV files on S3: s3://bucket/dataset/p=1/*.csv (partition #1) ... s3://bucket/dataset/p=100/*.csv (partition #100) I run a classifier over s3://bucket/dataset/ and the result looks very promising as it detects 150…
Raffael
  • 19,547
  • 15
  • 82
  • 160