Questions tagged [amazon-athena]

Amazon Athena is a service for running SQL queries against data stored on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Amazon Athena is a service for running SQL queries against data stored in files on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Athena is powered by the Presto query engine and uses Apache Hive Metastore for database and table definitions. It supports both dynamic and static partitions for tables. Athena supports data stored in delimited text files, JSON, ORC, Avro, and Parquet.

Athena is a serverless tool - there is no infrastructure to manage, and cost is calculated by the quantity of data scanned during each query.

See the Athena Documentation for more.

3440 questions

vote

1 answer

PrestoDB/AWS Athena- Retrieve a large SELECT by chunks

I have to select more than 1.9 billion rows. I am trying to query a table hosted in a DB in AWS ATHENA console. The table is reading parquet files from the a S3 bucket. When I run this query: SELECT * FROM ids WHERE org = 'abcd' AND idkey =…

asked Dec 04 '19 at 22:58

Raj

vote

2 answers

Amazon Athena (Presto) SELECT statement to create (n^2 + n)/2 (th triangular number)

I'm using Athena and trying to find a way to create a select statement that will return a sequence in the below format: Numer 1 2 2 3 3 3 4 4 4 4 And so on, up to 200. Is it even possible?

presto amazon-athena

asked Dec 02 '19 at 13:16

Byrdziu

vote

1 answer

HIVE_INVALID_METADATA: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 43 elements while columns.types has 34 elements

I'm working on a client platform. It is a DataLake linked to AWS S3 and AWS ATHENA I've uploaded a Dataset to an S3 Bucket using AWS GLUE. The job ran successfully and a table was created under ATHENA. When I try to "Preview" the content of the…

amazon-s3 hive amazon-athena

asked Nov 28 '19 at 14:17

Ben Yedder Saif

vote

2 answers

Connect to AWS Athena with JDBC driver when AWS access is given through (Azure) OIDC provider

We connect to AWS through our office365 users given to us by our company. Since we assume a role in the UI, how would I connect to Athena from a JDBC tool like DataGrip?

amazon-web-services jdbc openid-connect amazon-iam amazon-athena

asked Nov 28 '19 at 11:40

pascalwhoop

2,984
3
26
40

vote

1 answer

Redshift Spectrum much slower than Athena?

Our data is stored in S3 as JSON without partitions. Until today we were using only athena but now we tried Redshift Spectrum. We are running the same query twice. Once using Redshift Spectrum and once using Athena. Both connect to the same data in…

amazon-s3 tableau-api amazon-athena amazon-redshift-spectrum

asked Nov 27 '19 at 12:33

AJ222

1,114
2
18
40

vote

1 answer

SQL query to get week number from date while setting week starting from Sunday

I am trying to run a query in AWS Athena to get the week number from a date. I need the week to start on Sunday. I have tried the below query and it works but it considers the week starts from Monday. Extract(week from date) as week_number Google…

sql datetime presto amazon-athena

asked Nov 25 '19 at 17:56

man utd

vote

1 answer

Athena: is datatype `struct` equivalent to `map`?

My dataflow is: json -> parquet -> Athena, but I'm having a problem with the nested key-value field tags. The JSON file is [{"myid":1,"name":"foo","tags":{"tag1":"a","tag2":"b"}}, {"myid":2,"name":"bar","tags":{"tag1":"c","tag2":"d"}} ] The Athena…

amazon-athena

asked Nov 23 '19 at 12:00

Shadi

9,742
4
43
65

vote

1 answer

Is there a way to pull data-range in AWS Athena when the timestamp is big int using mySQL & PrestoDB?

I am trying to query from a table where I am particularly interested to pull the last 7 days from the timestamp column. The timestamp column has been stored in the table as big int. I tried using this after converting the 'big int' timestamp to…

mysql timestamp amazon-athena presto date-arithmetic

asked Nov 20 '19 at 21:14

Raj

vote

2 answers

How to skip documents that do not match schema in Athena?

Suppose I have an external table like this: CREATE EXTERNAL TABLE my.data ( `id` string, `timestamp` string, `profile` struct< `name`: string, `score`: int> ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' WITH SERDEPROPERTIES…

amazon-athena

asked Nov 19 '19 at 14:15

sdgfsdh

33,689
26
132
245

vote

2 answers

SQL partially order results

I have a table in Amazon Athena (a Presto engine) of some user events that among others have userId (VARCHAR) and ts (timestamp, BIGINT) column. The table is quite big, some hundreds of millions of records. I would like to create a query that orders…

sql performance amazon-web-services presto amazon-athena

asked Nov 19 '19 at 08:24

MrTJ

13,064
4
41
63

vote

1 answer

How do I identify problematic documents in S3 when querying data in Athena?

I have a basic Athena query like this: SELECT * FROM my.dataset LIMIT 10 When I try to run it I get an error message like this: Your query has the following error(s): HIVE_BAD_DATA: Error parsing field value for field 2: For input string:…

amazon-s3 amazon-athena

asked Nov 18 '19 at 16:42

sdgfsdh

33,689
26
132
245

vote

2 answers

SQL to get next monday in Presto

I need to get the date for next Monday irrespective of which day the sql is executed. I think the logic 7 - day_of_week(current_date)%7 + 1 would work, but then how to get the date. select current_date; _col0 1 2019-11-16 select (7 -…

sql datetime presto amazon-athena date-arithmetic

asked Nov 16 '19 at 00:51

Anand

vote

1 answer

Force athena/presto to respect ordering of files from S3

I have a file where the ordering is important. Essentially the file contains multiple record types, and first record of the group defines the ID that links subsequent records, until another "first" record is found. There can be any number of…

presto amazon-athena

asked Nov 15 '19 at 11:07

Codek

5,114
3
24
38

vote

1 answer

How to connect Athena with Power BI using ODBC

How to connect AWS Athena with Power BI using ODBC. I have installed 64 bit Simba Athena Connection. I have added appropriate bucket policy. { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", …

amazon-s3 powerbi amazon-athena

asked Nov 14 '19 at 16:06

Srihari Karanth

2,067
2
24
34

vote

1 answer

Best way to save pandas DataFrame to parquet with date type

I'm trying to save DataFrame with date type column to a parquet format to be used later in Athena. As far as I understand parquet has native DATE type, by the only type I can really use is datetime64[ns] with pyarrow engine (here is the same issue …

python pandas amazon-athena pyarrow

asked Nov 14 '19 at 10:23

kismsu

1,049
7
22

Prev 1 2 3

…

99 100 Next