Questions tagged [amazon-athena]

Amazon Athena is a service for running SQL queries against data stored on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Amazon Athena is a service for running SQL queries against data stored in files on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Athena is powered by the Presto query engine and uses Apache Hive Metastore for database and table definitions. It supports both dynamic and static partitions for tables. Athena supports data stored in delimited text files, JSON, ORC, Avro, and Parquet.

Athena is a serverless tool - there is no infrastructure to manage, and cost is calculated by the quantity of data scanned during each query.

See the Athena Documentation for more.

3440 questions
26
votes
1 answer

random sample of size N in Athena

I'm trying to obtain a random sample of N rows from Athena. But since the table from which I want to draw this sample is huge the naive SELECT id FROM mytable ORDER BY RANDOM() LIMIT 100 takes forever to run, presumably because the ORDER BY…
RoyalTS
  • 9,545
  • 12
  • 60
  • 101
24
votes
1 answer

AWS Athena too slow for an api?

The plan was to get data from aws data exchange, move it to an s3 bucket then query it by aws athena for a data api. Everything works, just feels a bit slow. No matter the dataset nor the query I can't get below 2 second in athena response time.…
athomas
  • 343
  • 2
  • 4
24
votes
4 answers

How to convert string into timestamp in Presto (Athena)?

I want to convert datatype of string (eg : '2018-03-27T00:20:00.855556Z' ) into timestamp (eg : '2018-03-27 00:20:00'). Actually I execute the query in Athena : select * from tb_name where elb_status_code like '5%%' AND date between…
Harshit Agrawal
  • 259
  • 1
  • 2
  • 4
24
votes
2 answers

Amazon Athena Convert String to Date

I am looking to convert the following string: mmm-dd-yyyy to a date: yyyy-mm-dd e.g Nov-06-2015 to 2015-11-06 within Amazon Athena
Data_101
  • 893
  • 7
  • 14
  • 25
23
votes
1 answer

Athena: Query exhausted resources at scale factor

I am running a query like: SELECT f.*, p.countryName, p.airportName, a.name AS agentName FROM ( SELECT f.outboundlegid, f.inboundlegid, f.querydatetime, cast(f.agent as bigint) as agent, …
Jiew Meng
  • 84,767
  • 185
  • 495
  • 805
22
votes
6 answers

AWS Athena export array of structs to JSON

I've got an Athena table where some fields have a fairly complex nested format. The backing records in S3 are JSON. Along these lines (but we have several more levels of nesting): CREATE EXTERNAL TABLE IF NOT EXISTS test ( timestamp double, …
Chelsar
  • 373
  • 2
  • 8
22
votes
3 answers

How to pivot rows into columns in AWS Athena?

I'm new to AWS Athena and trying to pivot some rows into columns, similar to the top answer in this StackOverflow post. However, when I tried: SELECT column1, column2, column3 FROM data PIVOT ( MIN(column3) FOR column2 IN…
Louis
  • 1,123
  • 5
  • 15
  • 24
22
votes
1 answer

Checking array in Athena

I have a table in Athena where one of the columns is of type array. However, when I run select * from mytable where array_contains(myarr,'foobar') limit 10 it seems Athena doesn't have the array_contains function: SYNTAX_ERROR: line 2:7:…
dunstantom
  • 551
  • 2
  • 5
  • 10
20
votes
7 answers

Amazon Athena CREATE EXTERNAL TABLE mismatched input 'external' invalidrequestexception

I am trying to create an external table in Amazon Athena. My query is the following: CREATE EXTERNAL TABLE priceTable ( WeekDay STRING, MonthDay INT, price00 FLOAT, price01 FLOAT, price02 FLOAT, price03 FLOAT, price04 FLOAT, price05…
MrMuppet
  • 547
  • 1
  • 4
  • 12
20
votes
1 answer

AWS Athena concurrency limits: Number of submitted queries VS number of running queries

According to AWS Athena limitations you can submit up to 20 queries of the same type at a time, but it is a soft limit and can be increased on request. I use boto3 to interact with Athena and my script submits 16 CTAS queries each of which takes…
Ilya Kisil
  • 2,490
  • 2
  • 17
  • 31
20
votes
4 answers

Add a partition on glue table via API on AWS?

I have an S3 bucket which is constantly being filled with new data, I am using Athena and Glue to query that data, the thing is if glue doesn't know that a new partition is created it doesn't search that it needs to search there. If I make an API…
Gudzo
  • 639
  • 2
  • 8
  • 21
20
votes
2 answers

AWS Glue issue with double quote and commas

I have this CSV file: reference,address V7T452F4H9,"12410 W 62TH ST, AA D" The following options are being used in the table definition ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'quoteChar'='\"', …
ln9187
  • 730
  • 1
  • 7
  • 23
18
votes
3 answers

converting a struct to a json when querying athena

I have an athena table which I did not create or manage, but can query. one of the fields is a struct type. for the sake of the example let's suppose it looks like this: my_field struct
amit
  • 3,332
  • 6
  • 24
  • 32
17
votes
1 answer

Amazon Athena: Convert bigint timestamp to readable timestamp

I am using Athena to query the date stored in a bigInt format. I want to convert it to a friendly timestamp. I have tried: from_unixtime(timestamp DIV 1000) AS readableDate And to_timestamp((timestamp::bigInt)/1000, 'MM/DD/YYYY HH24:MI:SS') at time…
noobeerp
  • 417
  • 2
  • 6
  • 11
17
votes
2 answers

How to create Athena database via API

I would like to create a database in Athena via API. I have parquet files in S3 that I would like to query using the API and I would like to use Athena for the query. Anyway I can create a database via API for Athena?
condo1234
  • 3,285
  • 6
  • 25
  • 34