Questions tagged [presto]

Presto is an open source distributed SQL query engine for running analytic queries against data sources of all sizes, ranging from gigabytes to petabytes. The community version of Presto is now called Trino. Amazon's serverless query service, Athena, uses Presto under the hood.

What is Presto?

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

What can it do?

Presto allows querying data where it lives, including Hive, HBase, relational databases or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.

Presto is targeted at analysts who expect response times ranging from sub-second to minutes. Presto breaks the false choice between fast analytics on an expensive commercial solution and a slow "free" solution that requires excessive hardware.

3114 questions
1
vote
1 answer

SqlStageExecution Error in executing Presto query

I am trying to run Presto on Amazon. I have just one node on which I configured Presto server. I haven't set up Presto on the other nodes in the cluster yet. Trying out a simple select query throws the following exception. Any insights? Is it coming because…
user2980461
  • 99
  • 3
  • 4
1
vote
2 answers

Can Presto connect to other Hadoop distributions and run queries

I see Presto has a plugin only for CDH4. Can I connect to other distributions such as Hortonworks from it, and what does it take to do so? Without a specific plugin, I am running into "path host null" errors when executing queries from Presto.…
user2980461
  • 99
  • 3
  • 4
1
vote
1 answer

Executing Presto Task for QA and Production but not in Dev

I have a task that needs to run in QA and prod, but not dev. The task is to stop a clustered application. The problem is that the dev servers aren’t clustered and the task to stop the cluster fails on these servers. Is there a way to handle this?
Jim
  • 4,910
  • 4
  • 32
  • 50
1
vote
1 answer

How do I have Presto create a Windows service with a specific user name and password?

How do I have Presto create a Windows service with a specific user name and password? I'd post what I tried, but I'm not really sure where to start.
Jim
  • 4,910
  • 4
  • 32
  • 50
0
votes
0 answers

Unable to fetch last modified date in AWS Athena using Presto SQL

I want to write a query in AWS Athena that gives me the last (latest) modified date in that column, but instead it gives me a date from years ago, even though I know the latest date is around 2021. SELECT 'Direct' AS "ERP System", 'MYTABLE' AS…
0
votes
1 answer

Getting Blank Values while extracting data from list of JSON in String column using AWS Athena

I have a table with below columns and data types. id string name string title string listItems string purchase_list table: id name title listItems 123 Peter Purchase List [{"manufacture_date":"2023-01-01","purchase_price":"20.0"},…
vvazza
  • 421
  • 7
  • 21
0
votes
0 answers

Hue does not work with https to connect Presto

The connection is closed when I try to access Hue from a browser via https. When I change https to http in the browser I can access it, but after logging in, it gives the following error: HTTPSConnectionPool(host='@myhost.mydomain', port=8443): Max retries…
0
votes
1 answer

Does ORDER BY matter for performance in AWS Athena Presto?

When you use CTAS queries to create new tables, you can add an ORDER BY. You have to do this when you combine it with bucket_by. If you don't bucket, will an ORDER BY still matter for your performance?
Roelant
  • 4,508
  • 1
  • 32
  • 62
0
votes
1 answer

How to Grab the December 1st of This Current Year

Example: I want to grab the 1st of December of the current year using the current date. If the year changes, for example the current_date is now 2024-01-01, I want the 1st of December of the current year for that date instead. Is that possible?
Maggie Liu
  • 344
  • 1
  • 3
  • 15
0
votes
1 answer

Associate Record to Nearest Manager

I have a table (t_budget) that lists the IDs of employees who are responsible for a given budget along with the allocated amount & spend against budget. This table is the result of a larger query that first targets HR data to get list of budget…
urdearboy
  • 14,439
  • 5
  • 28
  • 58
0
votes
3 answers

How to extract last part of a string using regex

I want to use part of a string in a WHERE clause; the issue is correctly extracting the ending part of the string. An example of the string is…
icenature
  • 39
  • 4
0
votes
1 answer

I have a table 'A' which has 119940377 records, and table 'A' fetches values from 4 different tables based on priority. I want to find count(a)

So if below are the records from four tables: 1st table: 1, 2, 3; 2nd table: 1, 2, 5; 3rd table: 2, 6, 7; 4th table: 3, 4, 10. Then table A has 1, 2, 3 from the 1st table, 5 from the 2nd table, 6, 7 from the 3rd table, and 4 and 10 from the 4th table. with a as…
FFchin
  • 1
  • 2
0
votes
0 answers

Conditional columns based on values in other columns in SQL

I have a SQL that aggregates the monthly ticketing of different organisations to give an output like this - Organization Course Month Last Updated Status Tickets This Month Total Tickets Deleted…
harry04
  • 900
  • 2
  • 9
  • 21
0
votes
1 answer

SQL Athena/Presto to convert time duration to DD HH MM format

I have a table with two timestamp columns. My goal is to get the difference between those two timestamp columns, but the format should be DD HH MM. The table looks like below: Using the date_diff function, I can get the time difference in minutes, but…
KaraiKare
  • 155
  • 1
  • 2
  • 10
0
votes
0 answers

Flink SQL throw "java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream" Error

I want to create a Hudi table in my S3 storage. I followed the official documentation and added these configurations in flink-conf.yml: fs.allowed-fallback-filesystems: s3 state.backend: filesystem state.checkpoints.dir:…
Dodge_X
  • 37
  • 8