Questions tagged [trino]

Trino is an open source distributed SQL query engine for running analytic queries against data sources of all sizes, from gigabytes to petabytes. Trino is the community-driven fork of Presto and was created when the PrestoSQL project was renamed.

What is Trino?

Trino, formerly PrestoSQL, was designed and written from the ground up for interactive analytics. It approaches the speed of commercial data warehouses while scaling to the size of very large organizations.

What can it do?

Trino allows querying data where it lives, including HDFS/Hive, object storage systems like S3, table formats such as Iceberg and Delta Lake, many relational databases, NoSQL/document databases, and even proprietary data stores. A single Trino query, written in standard SQL, can combine data from multiple sources, allowing for analytics across your entire organization.

Trino is targeted at analysts who expect response times ranging from sub-second to minutes. Trino breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

The massively parallel processing design of Trino delivers high performance and scales horizontally to match your workload.

680 questions
4 votes, 0 answers

Spark concurrency performance issue Vs Presto

We are benchmarking Spark with Alluxio and Presto with Alluxio. To evaluate the performance, we took 5 different queries (with some joins, group by and sort) and ran them on a 650 GB dataset in ORC. Spark execution environment is set up in such a…
Rijo Joseph
4 votes, 2 answers

Does Presto cache intermediate results internally out of the box?

Presto has multiple connectors. While the connectors do implement read and write operations, from all the tutorials I read, it seems they are typically used as data sources to read from only. For example, Netflix has "10 petabytes" of data on Amazon S3…
Jan
3 votes, 1 answer

How to create a new column in Athena table from an array based on a specific key value?

We use QuickSight for visualizing cost data. GCP exports a lot of their billing data fields as arrays, but AWS QuickSight can't import arrays currently. So, I am trying to create an Athena view by pulling certain values out of the arrays and putting…
Vera
3 votes, 1 answer

SQL How do I generate a row with value 0 of customers in days without order

I would like to create a query that takes all the dates of the year 2023 (here I use the order table, which registers orders and always has a row for every day, as an auxiliary table to get the dates) and then give left join…
vichay
3 votes, 1 answer

How to apply same column id(value) for different rank values based on partition columns

I have a table with the columns and values given below. I am trying to assign the same id to all rank values within a partition: rows currently get a different id for each rank, but the rows with a rank other than 1 should also carry the same id. have…
Gowtham M
3 votes, 2 answers

unnesting empty or null array leading to missing rows

I'm using Trino/Presto and trying to unnest an array column that can contain empty or null arrays, which results in those rows going missing: with table1(id, arr) as ( values (1, array[1,2,3]), (2, array[]), (3,…
Suricat
3 votes, 1 answer

How do you define a primary key on a table in Trino?

The query below does not work. CREATE TABLE test_table (date varchar, id varchar, PRIMARY KEY (date,id)) I can't seem to find any docs on primary keys in Trino.
Paul
3 votes, 1 answer

Trying to understand Trino and how it handles Time Zones

I'm observing strange behaviors when trying to convert timestamps between time zones in Trino. I believe it may be due to some conversions not behaving as expected. Perhaps someone can explain why CAST(current_timestamp as timestamp) observes the…
3 votes, 0 answers

Automatic partition addition to hive metastore in a Presto setup

Source: S3 Query Engine: Athena In our use case, several partitions (and hence, files) are added to S3 continuously and the partitions are made available immediately via dynamic partition projection. So the cost of adding partitions is not…
TJ-
3 votes, 1 answer

Adding hours to dates in Presto

In a Presto db table, I have two string fields, a date of the form, '2022-01-01', and an hour of the form, 22, for 22:00. I'm trying to combine these two elements into a proper timestamp, with date, hours, minutes, seconds. How can I accomplish…
jbuddy_13
3 votes, 0 answers

JupyterLab cannot authenticate with Trino (PrestoSQL) using OAuth2 token

I am using Trino to connect to PrestoSQL for my organization in the manner below with python. The MFA authentication requires that I click a link to authenticate. The links usually look something like this:…
albajoin
3 votes, 2 answers

Example for CREATE TABLE on TRINO using HUDI

I am using Spark Structured Streaming (3.1.1) to read data from Kafka and use HUDI (0.8.0) as the storage system on S3 partitioning the data by date. (no problems with this section) I am looking to use Trino (355) to be able to query that data. As a…
gunj_desai
3 votes, 1 answer

Difference between two presto jdbc

A few days ago, I sent a query using Presto. It was a really simple query like " select * from table limit 3; " but a JDBC error occurred. I checked my driver. At that time, I used the PrestoDB driver. Class name was…
carrot
3 votes, 1 answer

Prestosql/Amazon Athena: Time Zone Change

I need to change a UTC timestamp to 'US/Eastern' timestamp without changing the date and time - essentially update only the timezone information and later convert that to a different timezone. For example (what I need): '2021-06-09 19:00:36.000000'…
3 votes, 0 answers

Are there any presto or trino connectors for Solr / SolrCloud?

I need to run some scheduled SQL statements for SolrCloud programmatically in a dashboard, joining several collections. The only option I have found so far is Presto, but I could not find any Solr connectors within Presto or Trino, so far I…
Haf