Questions tagged [presto]

Presto is an open source distributed SQL query engine for running analytic queries against data sources of all sizes, ranging from gigabytes to petabytes. The community fork of Presto, formerly PrestoSQL, is now called Trino. Amazon's serverless query service, Athena, uses Presto under the hood.

What is Presto?

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

What can it do?

Presto allows querying data where it lives, including Hive, HBase, relational databases or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.
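
As a rough sketch of such a cross-source query, the example below joins a Hive table with a MySQL table in a single statement. The catalog, schema, table, and column names (`hive.web.page_views`, `mysql.crm.customers`, `customer_id`, `country`) are hypothetical and assume that both a Hive catalog and a MySQL catalog are configured on the cluster.

```sql
-- Hypothetical catalogs: `hive` (Hive connector) and `mysql` (MySQL connector).
-- Presto addresses tables as catalog.schema.table.
SELECT c.country,
       count(*) AS views
FROM hive.web.page_views AS v
JOIN mysql.crm.customers AS c
  ON v.customer_id = c.customer_id
GROUP BY c.country
ORDER BY views DESC
LIMIT 10;
```

Each side of the join is read through its own connector, so neither data set has to be copied into the other system first.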

Presto is targeted at analysts who expect response times ranging from sub-second to minutes. Presto breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

3114 questions

6 votes, 1 answer

What is $path pseudo column? What is the use of it in Athena (Presto)?

What exactly is "$path" used for? I just ran "select "$path" from table limit 10" in Athena, and it shows the S3 file path where the data is stored. But when I use limit 10, it shows the same path 10 times; if I don't limit the statement it's…
Roy

6 votes, 3 answers

Amazon athena can't read S3 JSON Object files and Athena select query returns empty result sets for JSON key columns

I created a table in Athena with the structure below: CREATE EXTERNAL TABLE s3_json_objects ( devId string, type string, status string ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' WITH SERDEPROPERTIES ( 'ignore.malformed.json' =…
SamDev

6 votes, 2 answers

presto - getting days interval (not date)

How do I get the days interval in PrestoDB? I can convert to milliseconds and then convert these to a number of days, but I am looking for a shorter way to do this. Example: I want to see how many days it has been since the first row was inserted in…
addicted

6 votes, 2 answers

Presto with Kubernetes

We are trying to implement Presto with Kubernetes. We have a Kubernetes cluster running in the cloud as a service. I searched for this but could not find a conclusive answer on the best practices for deploying Presto with Kubernetes.…
Anshul Verma

6 votes, 1 answer

Presto SQL: TO_UNIXTIME

I want to convert a readable timestamp to UNIX time. For example: I want to convert 2018-08-24 18:42:16 to 1535136136000. Here is my syntax: TO_UNIXTIME('2018-08-24 06:42:16') new_year_ut My error is: SYNTAX_ERROR: line 1:77: Unexpected…
noobeerp

6 votes, 2 answers

AWS Athena json_extract query from string field returns empty values

I have a table in Athena with this structure: CREATE EXTERNAL TABLE `json_test`( `col0` string , `col1` string , `col2` string , `col3` string , `col4` string , ) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH…
Fernando Byn

6 votes, 0 answers

Can't read data in Presto - can in Hive

I have a Hive DB in which I created a table compatible with the Parquet file type. CREATE EXTERNAL TABLE `default.table`( `date` date, `udid` string, `message_token` string) PARTITIONED BY ( `dt` date) ROW FORMAT SERDE …
Bramat

6 votes, 1 answer

How do I find percentages of a column using Hive/Presto

Let's say I have a table that looks like:

Reason | Duration
Eating | 40
Drinking | 60
Everything Else | 100

How do I get a table like this:

Reason | Duration | Duration Percent
Eating | 40 | …
UtsavShah

6 votes, 1 answer

How can I check the partition list from Athena in AWS?

I want to check the partition list in Athena. I used a query like this: show partitions table_name. But I want to search for whether a specific table exists, so I used a query like the one below, but no results were returned: show partitions table_name…
Bethlee

6 votes, 1 answer

Does presto require a hive metastore to read parquet files from S3?

I am trying to generate Parquet files in S3 using Spark, with the goal that Presto can later be used to query the Parquet data. Basically, this is how it looks: Kafka-->Spark-->Parquet<--Presto. I am able to generate Parquet in S3 using Spark…
Dangerous Scholar

6 votes, 3 answers

SQL Summing digits of a number

I'm using Presto. I have an ID field which is numeric. I want a column that adds up the digits within the ID. So if ID=1234, I want a column that outputs 10, i.e. 1+2+3+4. I could use substring to extract each digit and sum it, but is there a function…
Moosa

6 votes, 3 answers

Presto and hive partition discovery

I'm using Presto mainly with the Hive connector to connect to a Hive metastore. All of my tables are external tables pointing to data stored in S3. My main issue with this is that there is no way (at least none I'm aware of) to do partition discovery in…
Lior Baber

6 votes, 3 answers

Timerange Table SQL Presto

I need to use a temporary timerange table for my SQL query in treasure data presto: CREATE TEMPORARY TABLE fakehours (Hour BIGINT); INSERT INTO Hour VALUES (0); INSERT INTO Hour VALUES (1); INSERT INTO Hour VALUES (2); INSERT INTO Hour VALUES…
Javier

6 votes, 1 answer

User Defined Functions in Presto

I am currently working with Presto 0.80. I have to write a user defined function to convert degrees Celsius to degrees Fahrenheit during a select query. I did the same using Hive QL but was wondering if we can replicate the same in Facebook Presto. Any…
user3339340

6 votes, 1 answer

How to access a json field with "~" in the field name with Presto JSON functions

I have a "~" in my JSON field names, such as "~id". Using Presto 0.75, I am unable to access such fields. The following is what I have tried so far without success: SELECT json_extract_scalar('{"id":"1","table":"test"}', '$.table'); // This works SELECT…
l8Again