Questions tagged [presto]

Presto is an open source distributed SQL query engine for running analytic queries against data sources of all sizes, ranging from gigabytes to petabytes. The community version of Presto is now called Trino. Amazon's serverless query service, Athena, uses Presto under the hood.

What is Presto?

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

What can it do?

Presto allows querying data where it lives, including Hive, HBase, relational databases or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.
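As a minimal sketch of what combining sources in one query can look like, the example below joins a table in a Hive catalog with a table in a relational database catalog. The catalog, schema, table, and column names (`hive.sales.orders`, `mysql.crm.customers`, `customer_id`, and so on) are hypothetical; the actual names depend on how the catalogs are configured on a given cluster.

```sql
-- Hypothetical catalogs: "hive" (data lake) and "mysql" (relational database).
-- Presto addresses tables as catalog.schema.table, so one query can span both.
SELECT
    c.name,
    count(*)           AS order_count,
    sum(o.total_price) AS total_spent
FROM hive.sales.orders AS o
JOIN mysql.crm.customers AS c
    ON o.customer_id = c.customer_id
GROUP BY c.name
ORDER BY total_spent DESC
LIMIT 10;
```

Each side of the join is read through its own connector, and the join itself runs inside Presto.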

Presto is targeted at analysts who expect response times ranging from sub-second to minutes. Presto breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

3114 questions
6 votes, 5 answers

File formats supported by Presto

What file formats does Presto support? Are any specific file formats recommended for better performance? I would be interested to know if there is any columnar file format, like RCFile, that is optimized for Presto.
Animesh Raj Jha • 2,704 • 1 • 21 • 25

5 votes, 2 answers

Get JSON object keys as array in Presto/Trino

I have JSON data like this in one of my columns: {"foo": 1, "bar": 2} and {"foo": 1}. I would like to run a query that returns the keys of each value as an array: foo,bar and foo.
Boris Verkhovskiy • 14,854 • 11 • 100 • 103

5 votes, 0 answers

S3 Select with Presto

I am trying out S3 Select from Presto using the Hive connector and a MinIO object store. I am able to create an external table and run all the SQL queries, but S3 Select does not seem to be working, even with hive.s3select-pushdown.enabled=true set…

5 votes, 1 answer

Construct json from data using Presto

If I have data in a table as follows: WITH dataset AS ( SELECT ARRAY[ CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)), CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)), CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age…
Ram • 189 • 1 • 4 • 19

5 votes, 1 answer

How can I hash a string to a bigint in presto?

I have a long string and I would like to semi-uniquely represent it as a bigint. Ideally I'd just take the hash, but Presto's hash functions seem to return "varbinary", and I can't find a function to convert a varbinary into a bigint. If I…
erbert • 153 • 1 • 4

5 votes, 1 answer

How can I turn columns into rows in Presto?

I want to turn columns into new rows and keep the values. For example:

BEFORE
NAME   COMEDY  HORROR  ROMANCE
brian  10      20      14
tom    20      10      11

AFTER
NAME   GENRE   RATING
brian  comedy  10
brian  horror  20
brian  …
nowheretogo • 125 • 1 • 5

5 votes, 1 answer

AWS Athena (Presto) how to transpose map to columns

AWS Athena query question: I have a nested map in my rows, and I would like to transpose its keys to columns. I could name the columns explicitly like items['label_a'], but in this case the keys are actually dynamic... From these rows: {id=1,…
Denn0 • 377 • 3 • 15

5 votes, 4 answers

How to count array elements occurrences in Presto?

I have an array in Presto and I'd like to count how many times each element occurs in it. For example, I have [a, a, a, b, b] and I'd like to get something like {a: 3, b: 2}
Max Mir • 59 • 1 • 3

5 votes, 1 answer

How to cast varbinary to varchar in presto

I have the following query where shopname is stored as varbinary instead of varchar type. select shopname, itemname from shop_profile where cast(shopname as varchar) = 'Starbucks'; This query returns an error "line 4:7: Cannot cast varbinary to…
adelle • 189 • 2 • 6 • 13

5 votes, 4 answers

Presto vs Impala: architecture, performance, functionality

Could you highlight the major differences between the two in architecture & functionality in 2019? And how do those differences affect performance? For some reason this excellent question was tagged as opinion-based. Extra question: why did Amazon decide to go…
VB_ • 45,112 • 42 • 145 • 293

5 votes, 3 answers

In Athena how do I query a member of a struct in an array in a struct?

I am trying to figure out how to query where I am checking the value of usage given the following table creation: CREATE EXTERNAL TABLE IF NOT EXISTS foo.test ( `id` string, `foo` struct< usages:array< struct< usage:string, …
NSA • 5,689 • 8 • 37 • 48

5 votes, 1 answer

presto: convert array to rows?

I have a table with array columns all_available_tags and used_tags. Example row 1: all_available_tags: A,B,C,D; used_tags: A,B. Example row 2: all_available_tags: B,C,D,E,F; used_tags: F. I want to get the distinct set of all_available_tags from all rows and…
user21479 • 1,179 • 2 • 13 • 21

5 votes, 1 answer

Date Between (Start & Now)

I'm not sure how to use the NOW() function in Presto. It seems like it should be straightforward, but I'm having no luck: SELECT DISTINCT field FROM table WHERE field BETWEEN '2019-01-01' AND NOW(). Here, field is a varchar.
urdearboy • 14,439 • 5 • 28 • 58

5 votes, 1 answer

AWS Athena: Convert a comma delimited string into rows

In AWS Athena, I want to write a query like this: SELECT some_function('row1,row2,row3'); And get back row1 row2 row3 How do I do this? I know I can write this instead, but it's less convenient for me: select * from (values ('row1'), ('row2'),…
Daniel Kaplan • 62,768 • 50 • 234 • 356

5 votes, 1 answer

Is there a pseudocolumn in Hive/Presto to get the "last modified" timestamp of a given file?

I have an external table in Athena linked to a folder in S3. There are some pseudocolumns in Presto that allow me to get metadata about the files sitting in that folder (for example, the $path pseudocolumn). I wonder if there…
d4nielfr4nco • 635 • 1 • 6 • 17