Questions tagged [amazon-athena]

Amazon Athena is a service for running SQL queries against data stored on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Amazon Athena is a service for running SQL queries against data stored in files on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Athena is powered by the Presto query engine and uses Apache Hive Metastore for database and table definitions. It supports both dynamic and static partitions for tables. Athena supports data stored in delimited text files, JSON, ORC, Avro, and Parquet.

Athena is a serverless tool - there is no infrastructure to manage, and cost is calculated by the quantity of data scanned during each query.

See the Athena Documentation for more.

3440 questions
17
votes
1 answer

substr at Amazon Athena

At Amazon Athena, I want to extract only the character string "2017-07-27" from the character string "2017-07-27 12:10:08". SELECT SUBSTRING (event_datetime.s, 0, 10) FROM production limit 10 I tried it like this which only returns numbers 0 to…
Akihiro Seki
  • 321
  • 2
  • 3
  • 7
17
votes
2 answers

AWS Athena and date_format

I have some issue while formatting a timestamp with Amazon Athena service. select date_format(current_timestamp, 'y') Returns just 'y' (the string). The only way I found to format dates in Amazon Athena is trough CONCAT + YEAR + MONTH + DAY…
emmekappa
  • 792
  • 1
  • 6
  • 9
17
votes
2 answers

What does "WITH SERDEPROPERTIES ( 'paths' = 'key1, key2, key3') " really do in Hive DDL json serde?

Much appreciated if anyone can provide a reference to this clause. I have been searching online with little luck.
Da Qi
  • 615
  • 5
  • 10
17
votes
1 answer

Amazon Athena and compressed S3 files

I have an S3 bucket with several zipped CSV files (utilization logs.) I'd like to query this data with Athena, but the output is completely garbled. It appears Athena is trying to parse the zip files without decompressing them first. Is it possible…
MattY
  • 527
  • 2
  • 4
  • 9
16
votes
1 answer

Athena DateDiff

I am trying to do what I think is a simple date diff function but for some reason, my unit value is being read as a column ("dd") so I keep getting a column cannot be resolved error I am using AWS Athena My code is this SELECT "reservations"."id"…
JoeD
  • 223
  • 1
  • 3
  • 5
16
votes
3 answers

Amazon Athena - Column cannot be resolved on basic SQL WHERE query

I am currently evaluating Amazon Athena and Amazon S3. I have created a database (testdb) with one table (awsevaluationtable). The table has two columns, x (bigint) and y (bigint). When I run: SELECT * FROM testdb."awsevaluationtable" I get all…
Joel
  • 399
  • 1
  • 4
  • 16
16
votes
1 answer

Executing Named Queries in Athena

We want to execute a parameterized query in Athena using the javascript sdk by aws. Seems Athena's named query may be the way to do, but the documentation seems very cryptic to understand how to go about doing this. It would be great if someone can…
16
votes
5 answers

How to Convert Many CSV files to Parquet using AWS Glue

I'm using AWS S3, Glue, and Athena with the following setup: S3 --> Glue --> Athena My raw data is stored on S3 as CSV files. I'm using Glue for ETL, and I'm using Athena to query the data. Since I'm using Athena, I'd like to convert the CSV files…
mark s.
  • 656
  • 2
  • 7
  • 14
15
votes
2 answers

StartQueryExecution operation: Unable to verify/create output bucket

I am trying to execute query on Athena using python. Sample code client = boto3.client( 'athena', region_name=region, aws_access_key_id=AWS_ACCESS_KEY_ID, aws_secret_access_key=AWS_SECRET_ACCESS_KEY ) …
NHD
  • 435
  • 1
  • 6
  • 17
15
votes
1 answer

Difference between "ROWS BETWEEN" and "RANGE BETWEEN" in (Presto) window function "OVER" clause

This question is primarily about older versions of PrestoSQL, which have been resolved in the (now renamed) Trino project as of versions 346. However, Amazon's Athena project is based off of Presto versions 0.217 (Athena Engine 2) and 0.172 (Athena…
mbafford
  • 2,266
  • 1
  • 20
  • 25
15
votes
4 answers

AWS Glue cannot create database from crawler: permission denied

I am trying to use an AWS Glue crawler on an S3 bucket to populate a Glue database. I run the Create Crawler wizard, select my datasource (the S3 bucket with the avro files), have it create the IAM role, and run it, and I get the following…
mhamrah
  • 9,038
  • 4
  • 24
  • 22
15
votes
1 answer

Athena equivalent to information_schema

For background, I come from a SQLServer background and make heavy use of the system tables & information_schema, to tell me all about my tables and columns. I didn't expect the exact same power in Athena, but currently very shocked and frustrated…
SimonB
  • 962
  • 1
  • 14
  • 36
15
votes
3 answers

Boto3 athena query without saving data to s3

I am trying to use boto3 to run a set of queries and don't want to save the data to s3. Instead I just want to get the results and want to work with those results. I am trying to do the following import boto3 client =…
muazfaiz
  • 4,611
  • 14
  • 50
  • 88
15
votes
1 answer

Unnesting in SQL (Athena): How to convert array of structs into an array of values plucked from the structs?

I am taking samples from a Bayesian statistical model, serializing them with Avro, uploading them to S3, and querying them with Athena. I need help writing a query that unnests an array in the table. The CREATE TABLE query looks like: CREATE…
Count Zero
  • 630
  • 1
  • 6
  • 15
15
votes
9 answers

AWS Athena Returning Zero Records from Tables Created from GLUE Crawler input csv from S3

Part One : I tried glue crawler to run on dummy csv loaded in s3 it created a table but when I try view table in athena and query it it shows Zero Records returned. But the demo data of ELB in Athena works fine. Part Two (Scenario:) Suppose I Have a…
Kush Vyas
  • 5,813
  • 2
  • 26
  • 36