Questions tagged [presto]

Presto is an open source distributed SQL query engine for running analytic queries against data sources of all sizes ranging from gigabytes to petabytes. The community version of Presto is now called Trino. Amazon serverless query service called Athena is using Presto under the hood.

What is Presto?

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

What can it do?

Presto allows querying data where it lives, including Hive, HBase, relational databases or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.

Presto is targeted at analysts who expect response times ranging from sub-second to minutes. Presto breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

References

3114 questions
4
votes
1 answer

How to fetch info from curly braces using Athena

I am trying to fetch info using Athena, I don't know much about the Athena service of AWS. Could anyone help me how to fetch this particular info i.e. "Attribute4": "RSQ". I am pasting a demo data in JSON format to give better understanding how the…
Ruby
  • 45
  • 4
4
votes
1 answer

Convert String to Array in Presto Athena

I am new in SQL. I am trying to pass a list in WHERE IN statement. But the values are in a string similar to this: '["id1", "id2", "id3", "id4", "id5", "id6"]' The WHERE IN statement do not accept string but accept it: WHERE IN (SELECT ids FROM…
sql_searcher
  • 43
  • 1
  • 1
  • 5
4
votes
1 answer

AthenaQueryError: Athena query failed: "NOT_SUPPORTED: Unsupported Hive type

I recently ran into the following error "AthenaQueryError: Athena query failed: "NOT_SUPPORTED: Unsupported Hive type", and for this I followed this stack overflow link: converting to timestamp with time zone failed on Athena error: The weird part…
4
votes
1 answer

generate date range between min and max dates Athena presto SQL sequence error

I'm attempting to generate a series of dates in Presto SQL (Athena) using unnest and sequence something similair to generate_series in postgres. my table looks like job_name | run_date A | '2021-08-21' A | '2021-08-25' B |…
Umar.H
  • 22,559
  • 7
  • 39
  • 74
4
votes
0 answers

Change the file format used by to_sql method

This works as expected and creates a new table. But the data is stored in a format that only spark can read. How do I store the data in csv format? from pyathena.pandas.util import to_sql to_sql( mrdf, "mrdf_table3", conn, "s3://" +…
shantanuo
  • 31,689
  • 78
  • 245
  • 403
4
votes
1 answer

SQL presto - cross join unnest null value

I have arrays of different sizes and I want each value in the array to be in separate rows. To do that, I have used the cross join unnest. It is working however, it is deleting null array. So, I have my column ID with the different arrays and some…
Aude Hamdi
  • 53
  • 1
  • 3
4
votes
1 answer

SQL query with linear interpolation and Group By

I have a datalake on AWS, queried using Athena, with the following structure and sample data Key | Date | Value ----+---------------+------- a | 01/01/2020 | 4.5 a | 05/01/2020 | 6 a | 06/01/2020 | 3.2 b | 01/01/2020 …
70ny
  • 748
  • 1
  • 7
  • 22
4
votes
1 answer

Converting scientific notation to decimal in SQL Presto

I'm working with Amazon Athena and there are several columns that come out as scientific notation and I need it to come out as either an integer or a decimal. I've tried doing the answer from this: Convert exponential to number in sql However, that…
shaheed313
  • 135
  • 1
  • 4
  • 12
4
votes
1 answer

How to compute window function for each nth row in Presto?

I am working with a table that contains timeseries data, with a row for each minute for each user. I want to compute some aggregate functions on a rolling window of N calendar days. This is achieved via SELECT SOME_AGGREGATE_FUN(col) OVER ( …
gshpychka
  • 8,523
  • 1
  • 11
  • 31
4
votes
0 answers

Recursively operation with the previous line on a same column (AWS Athena = Presto SQL)

I am trying to convert formulas from a Google Sheet file to SQL. I get stuck on a recursive formula (calling on the result of the previous line of the same column). Here is the table I am working with timestamp var1 1602460890 1 1602460900 …
David
  • 450
  • 3
  • 14
4
votes
1 answer

Is there a way in presto/athena to get column names based on a condition and use that in group by?

Is there a way in athena/presto to get column names based on a condition and use it in group by? SHOW COLUMNS FROM {table_name} This query gives me the column details. But I want to use only the column names in some other query. The structure of…
4
votes
3 answers

how to specify https protcol in SQLAlchemy > Presto connection?

I'm trying to run Hive queries via Presto using SQLAlchemy. It uses LDAP authentication but I'm missing something in the connection string. from sqlalchemy.engine import create_engine conn_string = 'presto://' + user + ':' + passw + '@' + host +…
Chuck
  • 1,061
  • 1
  • 20
  • 45
4
votes
2 answers

Amazon Athena: How to drop all partitions at once

My partitions look like these event_year=2019/event_week=37/event_date=2019-09-10 event_year=2019/event_week=42/event_date=2019-10-13 event_year=2019/event_week=8/event_date=2019-02-20 event_year=2020/event_week=24/event_date=2020-06-15 There are…
abcsjdj26
  • 41
  • 1
  • 2
4
votes
1 answer

QuickSight could not generate any output column after applying transformation Error

I am running a query that works perfectly on AWS Athena however when I use athena as a data source from quicksight and tries to run query it keeps on giving me QuickSight could not generate any output column after applying transformation error…
haneulkim
  • 4,406
  • 9
  • 38
  • 80
4
votes
2 answers

How to deal with a column that has just date (%m-%d-%Y) and datetime ('%m-%d-%Y %H:%i') records in the same column in AWS Athena?

I'm trying to convert a date column of string type to date type. I use the below query in AWS Athena: SELECT col0, col1, col2, date_parse(replace(col3, '/', '-'), '%m-%d-%Y') AS start_date FROM "bucket"."table" WHERE col3 <> '' This works for some…
bourgeoisna
  • 95
  • 1
  • 11