Questions tagged [trino]

Trino is an open source distributed SQL query engine for running analytic queries against data sources of all sizes, ranging from gigabytes to petabytes. Trino is the community-driven continuation of Presto and emerged from the renaming of the PrestoSQL codebase.

What is Trino?

Trino is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Trino, formerly Presto, was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of very large organizations.

What can it do?

Trino allows querying data where it lives, including HDFS/Hive, object storage systems like S3, Iceberg, Delta Lake, many relational databases, NoSQL/document databases, and even proprietary data stores. A single Trino query, written in standard SQL, can combine data from multiple sources, allowing for analytics across your entire organization.
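
For example, a single federated query can join tables that live in different systems. The sketch below assumes a `hive` catalog and a `postgresql` catalog have been configured; all catalog, schema, table, and column names are hypothetical:

```sql
-- Join an orders table stored in S3 (via the Hive connector) with a
-- customers table in PostgreSQL, in one standard-SQL query.
SELECT o.order_id, o.total, c.name
FROM hive.sales.orders AS o
JOIN postgresql.crm.customers AS c
  ON o.customer_id = c.id
WHERE o.order_date >= DATE '2024-01-01';
```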

Trino is targeted at analysts who expect response times ranging from sub-second to minutes. Trino breaks the false choice between having fast analytics using an expensive commercial solution or using a slow "free" solution that requires excessive hardware.

The massively parallel processing design of Trino allows very high performance, and horizontal scaling to adjust to your needs.

680 questions
5 votes, 2 answers

Does Presto SQL support recursive queries using CTEs, like SQL Server? e.g. employee hierarchy levels

I want to write a recursive query using a CTE in Presto to find an employee hierarchy. Does Presto support recursive queries? When I write a simple recursion such as with cte as(select 1 n union all select cte.n+1 from cte where n<50) select…
vaibhav
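
For reference, Trino supports recursive CTEs via `WITH RECURSIVE` (older Presto releases did not); a minimal sketch of the counter from the question, with the keyword and explicit column list that Trino's syntax expects:

```sql
-- Counts 1..49; note the RECURSIVE keyword and the column list
-- on the CTE, both part of Trino's recursive WITH syntax.
WITH RECURSIVE cte (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM cte WHERE n < 50
)
SELECT n FROM cte;
```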
5 votes, 2 answers

Compiled Python script gives the error "Can't load plugin: sqlalchemy.dialects:presto"

I compiled a .py file with PyInstaller as follows: pyinstaller --hidden-import presto --hidden-import scipy._lib.messagestream --onefile main.py When I ran the compiled file, I got the error: sqlalchemy.exc.NoSuchModuleError: Can't load plugin:…
nullne
5 votes, 0 answers

How to handle Hive locking across Hive and Presto

I have a few Hive tables that are insert-overwritten from Spark and Hive. Those tables are also accessed by analysts through Presto. Naturally, we're running into windows of time in which users hit an incomplete data set, because Presto is ignoring…
5 votes, 3 answers

How can Presto show partitions before it executes the HQL?

I used PyHive to connect to Hive via Presto. Can I find out the partitions of the Hive tables before Presto has executed the SQL?
user3065606
5 votes, 1 answer

What are the fundamental architectural, SQL compliance, and data use scenario differences between Presto and Impala?

Can some experts give succinct answers to the differences between Presto and Impala from these perspectives? Fundamental architecture design; SQL compliance; real-world latency; any SPOF or fault-tolerance functionality; structured and…
Yellow Duck
4 votes, 0 answers

How to consume Trino data with GraphQL

We have a data platform built around Trino and Privacera, and it is working well. It is possible to query data from dozens of sources through Trino, and authorization with Privacera works like a charm. Then we have a new requirement stating that they…
Bagaboo
4 votes, 1 answer

Generate a date range between min and max dates in Athena (Presto SQL): sequence error

I'm attempting to generate a series of dates in Presto SQL (Athena) using unnest and sequence, something similar to generate_series in Postgres. My table looks like job_name | run_date A | '2021-08-21' A | '2021-08-25' B |…
Umar.H
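
The closest equivalent of `generate_series` in Presto/Trino (also available in Athena) is `sequence` unnested into rows; a sketch using the dates from the question:

```sql
-- One row per calendar day between two dates, via sequence + UNNEST.
SELECT d
FROM UNNEST(
    SEQUENCE(DATE '2021-08-21', DATE '2021-08-25', INTERVAL '1' DAY)
) AS t(d);
```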
4 votes, 0 answers

Change the file format used by the to_sql method

This works as expected and creates a new table, but the data is stored in a format that only Spark can read. How do I store the data in CSV format? from pyathena.pandas.util import to_sql to_sql( mrdf, "mrdf_table3", conn, "s3://" +…
shantanuo
4 votes, 1 answer

Presto SQL: cross join unnest drops null values

I have arrays of different sizes and I want each value in an array to be in a separate row. To do that, I have used cross join unnest. It works; however, it drops rows whose array is null. So, I have my column ID with the different arrays, and some…
Aude Hamdi
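
A common workaround for the dropped rows is `LEFT JOIN ... UNNEST ... ON TRUE`; the table and column names below are hypothetical:

```sql
-- Unlike CROSS JOIN UNNEST, a LEFT JOIN ... ON TRUE keeps rows whose
-- array is NULL or empty, emitting NULL for the unnested value.
SELECT t.id, u.value
FROM my_table AS t
LEFT JOIN UNNEST(t.my_array) AS u(value) ON TRUE;
```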
4 votes, 2 answers

Why does AWS Athena need a 'spill-bucket' when it dumps results in the target S3 location?

Why does AWS Athena need a 'spill-bucket' when it dumps results in the target S3 location WITH ( format = 'Parquet', parquet_compression = 'SNAPPY', external_location = '**s3://target_bucket_name/my_data**' ) AS WITH my_data_2 AS (SELECT * FROM…
user1379280
4 votes, 2 answers

Creating bins in Presto SQL programmatically

I am new to Presto SQL syntax and am wondering if a function exists that will bin rows into n bins over a given range. For example, I have a table with 1M integers that range from 1 to 100. What can I do to create 20 bins between 1 and…
iskandarblue
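
Presto/Trino provide `width_bucket` for exactly this kind of equal-width binning; a sketch for 20 bins over 1–100, with hypothetical table and column names:

```sql
-- width_bucket(x, low, high, n) assigns x to one of n equal-width
-- buckets over [low, high); values outside the range land in 0 or n+1.
SELECT width_bucket(value, 1, 101, 20) AS bin,
       count(*) AS rows_in_bin
FROM my_table
GROUP BY 1
ORDER BY 1;
```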
4 votes, 1 answer

PrestoSQL: converting a UTC timestamp to local time?

How can I convert a timestamp field that includes day and time to local time in PrestoSQL? The fields look like Region ID | Start Time utc | End Time utc abc 2019-04-26 20:00:00.000 2019-04-26 23:00:00.000 cdx …
Chris90
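
One approach in Trino is to attach the UTC zone to the naive timestamp with `with_timezone`, then shift it with `AT TIME ZONE`; the target zone, table, and column names below are examples:

```sql
-- Declare the stored value as UTC, then render it in a local zone.
SELECT region_id,
       with_timezone(start_time_utc, 'UTC')
           AT TIME ZONE 'America/New_York' AS start_time_local
FROM my_table;
```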
4 votes, 3 answers

Reduce the amount of data scanned by Athena when using aggregate functions

The below query scans 100 MB of data: select * from table where column1 = 'val' and partition_id = '20190309'; However, the below query scans 15 GB of data (there are over 90 partitions): select * from table where column1 = 'val' and partition_id in…
Punter Vicky
4 votes, 1 answer

How to unnest multiple columns in Presto, outputting into corresponding rows

I'm trying to unnest some data. I have a couple of columns that contain arrays, both using | as a delimiter. The data is stored looking like this, with extra values to the side showing the current currency. I want to output it like…
Aki
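
When `UNNEST` is given several arrays, Presto/Trino zip them positionally (padding the shorter one with NULLs), so corresponding elements land in the same row; `split` turns the `|`-delimited strings into arrays. All names below are hypothetical:

```sql
-- split() builds arrays from '|'-delimited text; UNNEST with two
-- arguments zips them, so element i of each array shares a row.
SELECT t.id, u.amount, u.currency
FROM my_table AS t
CROSS JOIN UNNEST(
    split(t.amounts, '|'),
    split(t.currencies, '|')
) AS u(amount, currency);
```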
4 votes, 0 answers

Migrating tables from Hive to Cassandra - using COPY

I'm migrating tables from Hive/HDFS (using Presto to speed up migration) to Cassandra v3.11.3. My question: is there any other method that would be easier, as I have little time and a lot of tables to move? I have tried exporting tables from Hive to…
Hareesha