Questions tagged [presto]

Presto is an open source distributed SQL query engine for running analytic queries against data sources of all sizes ranging from gigabytes to petabytes. The community version of Presto is now called Trino. Amazon serverless query service called Athena is using Presto under the hood.

What is Presto?

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

What can it do?

Presto allows querying data where it lives, including Hive, HBase, relational databases or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.

Presto is targeted at analysts who expect response times ranging from sub-second to minutes. Presto breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

References

3114 questions
7
votes
3 answers

Presto query: Find the key with maximum value in a map

I have a table Name pets -------------- Andy {dog:2, cat:1, bird:4} John {tiger:3, elephant:1, fish:2} Mary {dog:2, pig:2} I want the find the pet type with maximum count for each person. In the event of tie, duplicate the row for each pet. The…
Edamame
  • 23,718
  • 73
  • 186
  • 320
7
votes
4 answers

Presto equivalent of CONCAT_WS

I'm looking for a function in Presto to concat two columns with a separator like underline.
MTT
  • 5,113
  • 7
  • 35
  • 61
7
votes
3 answers

How to format a number as percentage with 2 decimal places in Presto?

I have a number want to format it as percentage in Presto. I want to format 0.18932 as 18.93%. I tried: format('%s%%', decimal_number*100) format('%s%%', round((a.decimal_number*100), 2)) Here is the result: none formatted: 0.019435618 first…
Jason LiLy
  • 634
  • 2
  • 9
  • 19
7
votes
1 answer

Insert into static hive partition using Presto

Suppose I want to INSERT INTO a static hive partition, can I do that with Presto? The PARTITION keyword is only for hive. INSERT INTO TABLE Employee PARTITION (department='HR') Caused by: com.facebook.presto.sql.parser.ParsingException: line…
Tiberiu
  • 990
  • 2
  • 18
  • 36
7
votes
2 answers

How to cast varchar to MAP(VARCHAR,VARCHAR) in presto

I have table in presto, one column named ("mappings") have key-value pair as string select mappings from hello; Ex: {"foo": "baar", "foo1": "bar1" } I want to cast "mappings" column into a MAP like select CAST("mappings" as MAP) from hello; This…
7
votes
1 answer

Spark/SQL 2.4 checksum(), md5, etc

SELECT checksum(c) FROM ( SELECT to_hex( md5( to_utf8( concat( trim(coalesce("dcpheader_dcploaddayutc",'')), trim(coalesce("dcpheader_dcploadmonthutc",'')) ) ) ) ) -- end of to_hex, md5, to_utf () …
Acid Rider
  • 1,557
  • 3
  • 17
  • 25
7
votes
4 answers

HIVE_METASTORE_ERROR expected 'STRING' but 'STRING' is found

I've been unable to get any query to work against my AWS Glue Partitioned table. The error I'm getting is HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Error: type expected at the position 0 of 'STRING' but 'STRING' is found. …
Zambonilli
  • 4,358
  • 1
  • 18
  • 18
7
votes
2 answers

Presto: cast array>> into map>

I have a table like name string address string timezone string one_key_value …
John Constantine
  • 411
  • 1
  • 6
  • 15
7
votes
2 answers

Querying an Athena table partitioned by year, month, day

I have an Athena table partitioned by year, month, day with the following definition CREATE EXTERNAL TABLE `my_table`( `price` double) PARTITIONED BY ( `year` int, `month` int, `day` int) ROW FORMAT SERDE …
Shirkan
  • 859
  • 1
  • 9
  • 14
7
votes
4 answers

AWS Athena query error when trying to filter by date

I am trying to use Athena to query some data I have stored in an s3 bucket in parquet format. I have field called datetime which is defined as a date data type in my AWS Glue Data Catalog. When I try running the following query in Athena, I get the…
0xPeter
  • 546
  • 1
  • 6
  • 15
7
votes
1 answer

Presto failed: com.facebook.presto.spi.type.VarcharType

I created a table with three columns - id, name, position, then I stored the data into s3 using orc format using spark. When I query select * from person it returns everything. But when I query from presto, I get this error: Query…
Abdulhafeth Sartawi
  • 1,086
  • 1
  • 11
  • 20
7
votes
5 answers

How to deduplicate in Presto

I have a Presto table assume it has [id, name, update_time] columns and data (1, Amy, 2018-08-01), (1, Amy, 2018-08-02), (1, Amyyyyyyy, 2018-08-03), (2, Bob, 2018-08-01) Now, I want to execute a sql and the result will be (1, Amyyyyyyy,…
Archon
  • 1,385
  • 1
  • 15
  • 30
7
votes
1 answer

Split one row into multiple rows based on comma-separated string column

I have a table like below with columns A(int) and B(string): A B 1 a,b,c 2 d,e 3 f,g,h I want to create an output like below: A B 1 a 1 b 1 c 2 d 2 e 3 f 3 g 3 h If it helps, I am doing this in Amazon Athena…
ishan3243
  • 1,870
  • 4
  • 30
  • 49
7
votes
1 answer

Cannot cast '' to bigint in presto

I tried to clear 'null' in my query, but there is still error when run it, keep telling me that 'cannot cast '' to bigint, is there any way to fix it? myquery select m.app, m.hour,m.user_id, m.avg_minutes_watched, n.userid, n.watched_mins,…
joey
  • 115
  • 1
  • 1
  • 10
7
votes
2 answers

How to convert result of Presto `ST_Distance` to meters?

I am trying to figure out a way to convert the result of presto geo spatial function ST_DISTANCE to meters. If I run the this example query: SELECT ST_Distance(ST_Point(0.3476, 32.5825),ST_Point(0.0512, 32.4637)) The result I get from Presto…
user2293727
  • 71
  • 1
  • 4