Questions tagged [amazon-athena]

Amazon Athena is a service for running SQL queries against data stored on Amazon S3. Amazon Athena is part of Amazon Web Services (AWS).

Athena is powered by the Presto query engine and uses Apache Hive Metastore for database and table definitions. It supports both dynamic and static partitions for tables. Athena supports data stored in delimited text files, JSON, ORC, Avro, and Parquet.

Athena is a serverless tool - there is no infrastructure to manage, and cost is calculated by the quantity of data scanned during each query.
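Because cost depends on the amount of data scanned, a rough per-query estimate is simple arithmetic. The sketch below is only illustrative: the price of 5.00 USD per TB and the 10 MB per-query minimum are assumptions not stated in the text above, and `estimate_query_cost` is a hypothetical helper, not part of any AWS SDK. Check the current pricing for your region before relying on the numbers.

```python
# Rough estimate of the cost of one Athena query from the bytes it scanned.
# ASSUMPTIONS (not from the tag description): 5.00 USD per TB scanned and a
# 10 MB minimum charge per query. Both are parameters so they can be changed.

TB = 1024 ** 4
MB = 1024 ** 2


def estimate_query_cost(bytes_scanned, usd_per_tb=5.00, min_bytes=10 * MB):
    """Return the approximate cost in USD of a single query."""
    # Queries that scan less than the minimum are billed at the minimum.
    billable_bytes = max(bytes_scanned, min_bytes)
    return billable_bytes / TB * usd_per_tb


if __name__ == "__main__":
    # Example: a query that scanned 250 GB of data.
    scanned = 250 * 1024 ** 3
    print(f"approximate cost: {estimate_query_cost(scanned):.4f} USD")
```

Under these assumptions, storing data in a columnar format such as Parquet or ORC and partitioning tables lowers cost simply because each query scans fewer bytes.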

See the Athena Documentation for more.

3440 questions
1
vote
0 answers

String/INT96 to Datetime - Amazon Athena/SQL - DDL/DML

I have hosted my data in an S3 bucket in Parquet format and I am trying to access it using Athena. I can successfully access the hosted table. I noticed something odd when I tried to access the column "createdon". createdon is a timestamp…
Dinesh R
  • 21
  • 1
1
vote
1 answer

Long delay in querying Athena using Python

I wanted to ask the AWS community a question. I recently shifted to Athena, and have the following observation: it takes much more time to query data using pyathena (the Python client) than querying directly in Athena. I have a database of customer…
1
vote
1 answer

Array of JSON in Athena is read incorrectly and can't be unnested

I have a column called uf that contains an array of JSON objects. Here is a mockup: [ {"type": "browserId", "name": "", "value": "unknown"}, {"type": "campaign", "name": "", "value": "om_227dec0082a5"}, {"type": "custom", "name":…
Moseleyi
  • 2,585
  • 1
  • 24
  • 46
1
vote
3 answers

Presto syntax for CSV external table with an array in one of the fields

I'm having trouble creating a table in Athena that points at files with the following format: string, string, string, array. When I wrote the file, I delimited the array items with '|'. I delimited each line with '\n' and each column with…
liormayn
  • 203
  • 3
  • 12
1
vote
1 answer

How to write calculated field formula in AWS QuickSight using ifelse

I am trying to write a formula that would calculate a discounted cost depending on the date. Any costs that accrue after May 2019 would have a discount rate of 7% and anything prior to that would be 6%. This is what I have for the formula but it's…
1
vote
1 answer

Deploying cube.js using serverless framework results in an error

I am trying to deploy a cube.js project using the serverless framework on AWS, and when I access the endpoint produced by serverless, it results in the following error in the browser: "Cannot GET /". Here is my serverless.yml file: service:…
1
vote
0 answers

Disable connection validation when using Tomcat connection pool

I'm using a Tomcat connection pool with an Athena database, and I would like to disable request validation because it adds a lot of latency. For each request I get 2 extra requests like 'Select 1' and 'Select * (myRequest) limit 0'. Is it possible to…
Sfayn
  • 190
  • 1
  • 2
  • 13
1
vote
1 answer

Unpivot Columns inside of Amazon Athena without hardcoding

I am writing a query in AWS Athena. The original table is something like:
employee|manager1|manager2|manager3|... | manager10
12345|A . |B . |C . |... | (null)
54321|I . |II . |III . |... | X
And the result should…
Di Chu
  • 23
  • 1
  • 6
1
vote
3 answers

Alternative to create more than 100 partitions on Athena CTAS

I'm currently creating some new tables from information stored in Amazon S3. This is my first time using AWS, and today I learned that Amazon Athena can't create more than 100 partitions from a CTAS query. I'm doing the transformations using SQL, and it works perfectly,…
Alejandro
  • 519
  • 1
  • 6
  • 32
1
vote
3 answers

How to store AWS Athena output from a Python script in Excel?

I am querying AWS Athena using a Python script and the pyathena library, and I'm getting the correct output in the form of a table. Now the problem is that I want to store the output in Excel. Can anyone suggest how I can store, using a Python script,…
user9809879
1
vote
0 answers

AWS Athena - error - Can not read value at 0 in block 0 in file s3://

I can read data from an S3 location using Spark and Glue without issues, but when trying to read the same table with Athena, I get an error when running select * from mytable limit 10; HIVE_CURSOR_ERROR: Can not read value at 0 in block 0 in…
Joe
  • 11,983
  • 31
  • 109
  • 183
1
vote
0 answers

Data Inconsistencies After ETL in QuickSight?

I have raw data in RDS. I am using AWS Glue to crawl the data, and export the data through an ETL script in Glue (a full exact copy, no transformations or edits for now) to S3 as a single CSV file. I am trying to visualise this data in QuickSight…
1
vote
3 answers

Looking to fill missing time periods with zero for each ID (Athena SQL)

In my data, I'm looking at quarterly counts per user. For all of the quarters that are missing, I'd like to impute them with 0. Here is an example of my current dataset:
ID Qtr Count
1 2018Q1 1
1 2018Q3 1
1 2018Q4 2
2 2018Q1 4
2 2018Q2…
madsthaks
  • 377
  • 1
  • 6
  • 16
1
vote
0 answers

Schema deployment management for Athena

In order to apply devops principles to data (ugh, dataops!), things like continuous deployment need to be considered. Hence why tools like dbDeploy exist. However dbDeploy seems to have been orphaned and is not maintained any more. In the past…
Codek
  • 5,114
  • 3
  • 24
  • 38
1
vote
3 answers

Split and search comma separated column in Presto (AWS Athena)

I have the following table my_table, where both columns are strings:
+------------+-------------+
| user_id| code |
+------------+-------------+
| ABC123| yyy,123,333|
| John| xxx,USA,555|
| qwerty| 55A,AUS,666|
| …
kev
  • 2,741
  • 5
  • 22
  • 48