Questions tagged [presto]

Presto is an open source distributed SQL query engine for running analytic queries against data sources of all sizes, ranging from gigabytes to petabytes. The community version of Presto is now called Trino. Amazon's serverless query service, Athena, uses Presto under the hood.

What is Presto?

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

What can it do?

Presto allows querying data where it lives, including Hive, HBase, relational databases or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.

Presto is targeted at analysts who expect response times ranging from sub-second to minutes. Presto breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

3114 questions
9
votes
1 answer

How do I run md5() on a bigint in Presto?

select md5(15) returns Query failed (#20160818_193909_00287_8zejd): line 1:8: Unexpected parameters (bigint) for function md5. Expected: md5(varbinary) How do I hash 15 and get back a string? I'd like to select 1 in 16 items at random, e.g. where…
dfrankow
  • 20,191
  • 41
  • 152
  • 214
9
votes
3 answers

How to use Presto to query Hive data

I just installed presto and when I use the presto-cli to query hive data, I get the following error: $ ./presto --server node6:8080 --catalog hive --schema default presto:default> show tables; Query 20131113_150006_00002_u8uyp failed: Table…
Rui Li
  • 141
  • 1
  • 2
  • 6
9
votes
1 answer

Hardware requirements for Presto

I suspect the answer is "it depends", but is there any general guidance about what kind of hardware to plan to use for Presto? Since Presto uses a coordinator and a set of workers, and workers run with the data, I imagine the main issues will be…
benvolioT
  • 4,507
  • 2
  • 36
  • 30
8
votes
2 answers

How can I cut the left part of a string with unknown length? (with SQL function)

In the ETL process, I receive a varchar field whose value length varies from row to row. I need to keep 5 symbols from the right side of the string. It means that I need to cut the left side but I can't, due to the unknown…
Terpsihora
  • 83
  • 1
  • 1
  • 3
8
votes
1 answer

Running total sum over date presto SQL

I'm trying to calculate the cumulative sum of columns t and s over a date from my sample data below, using Presto SQL. Date | T | S 1/2/19 | 2 | 5 2/1/19 | 5 | 1 3/1/19 | 1 | 1 I would like to get Date | T | S | cum_T | cum_S 1/2/19 | 2 | 5…
user124123
  • 1,642
  • 7
  • 30
  • 50
8
votes
1 answer

How to remove new line characters from data rows in Presto/AWS Athena?

I'm querying some tables on Athena (Presto SAS) and then downloading the generated CSV file to use locally. Opening the file, I realised the data contains new line characters that don't appear in the AWS interface, only in the CSV, and I need to get rid…
cristianoms
  • 3,456
  • 3
  • 26
  • 28
8
votes
1 answer

Split one string with commas into columns

For example, I have the following table: | Block | | abcdefgh,12kjkjkj,231wewoxyz| How can I convert it into: | Block1 | Block2 | Block3 | | abcdefgh | 12kjkjkj | 231wewoxyz | Note: Each "Block" has a maximum of 8…
Tai Ngo
  • 97
  • 1
  • 1
  • 5
8
votes
1 answer

PrestoDB: select all dates between two dates

I need to build a report that provides some information for each date within a date interval. I need to do it within a single query (can't create any functions or supporting tables). How can I achieve that in PrestoDB? Note: There are lots of…
Sasha Shpota
  • 9,436
  • 14
  • 75
  • 148
8
votes
1 answer

Create a Presto table with a column as an Array datatype

How does one create a table in Presto with one of the columns having an Array datatype? For example: CREATE TABLE IF NOT EXISTS (ID BIGINT, ARRAY_COL ARRAY)...
8
votes
2 answers

AWS Athena - cast as JSON doesn't return JSON object

I have a list of json objects (result attribute) as in the example : select result from mytable limit 1 I get : [{hop=1, error=null, result=[{x=null, from=192.168.0.1, rtt=0.378, ttl=64, err=null, ittl=null, edst=null, late=null, mtu=null,…
Hayat Bellafkih
  • 587
  • 2
  • 11
  • 28
8
votes
2 answers

How can I get results in JSON format from Athena in AWS?

I want to get result values in JSON format from Athena in AWS. When I select from Athena, the result is formatted like this: {test.value={report_1=test, report_2=normal, report_3=hard}} Is there any way to get JSON format result without replacing "="…
Bethlee
  • 825
  • 3
  • 17
  • 28
8
votes
3 answers

How to get date_diff from previous rows in Presto?

I'm trying to get a diff_date from Presto from this data. timespent | 2016-04-09T00:09:07.232Z | 1000 | general timespent | 2016-04-09T00:09:17.217Z | 10000 | general timespent | 2016-04-09T00:13:27.123Z | 250000 |…
toy
  • 11,711
  • 24
  • 93
  • 176
8
votes
3 answers

Presto unnest json

Following this question: how to cross join unnest a json array in presto, I tried to run the example provided but I get an error while doing so. The SQL command: select x.n from unnest(cast(json_extract('{"payload":[{"type":"b","value":"9"},…
Lior Baber
  • 852
  • 3
  • 11
  • 25
7
votes
2 answers

Presto Produce JSON results

I have a json table which was created by CREATE TABLE `normaldata_source`( `column1` int, `column2` string, `column3` struct) Sample data: { "column1": 9, "column2": "Z", "column3": { "column4": "Y" } } If…
lclankyo
  • 221
  • 3
  • 10
7
votes
1 answer

How to pivot a table in Presto?

Consider a table named data with columns time, sensor, value: I want to pivot this table on Athena (Presto) to get a new table like this one: To do so, one can run the following query: SELECT time, sensor_value['temperature'] as "temperature",…
kakarotto
  • 180
  • 1
  • 11