I ran a query that should eventually return roughly 17M rows, reading it in chunks of 500,000. Everything seemed to be going just fine, but then I hit the following error:
```
Traceback (most recent call last):
  File "sql_csv.py", line 22, in <module>
    for chunk in pd.read_sql_query(hours_query, db.conn, chunksize = 500000):
  File "/Users/michael.chirico/anaconda2/lib/python2.7/site-packages/pandas/io/sql.py", line 1424, in _query_iterator
    data = cursor.fetchmany(chunksize)
  File "/Users/michael.chirico/anaconda2/lib/python2.7/site-packages/jaydebeapi/__init__.py", line 546, in fetchmany
    row = self.fetchone()
  File "/Users/michael.chirico/anaconda2/lib/python2.7/site-packages/jaydebeapi/__init__.py", line 526, in fetchone
    if not self._rs.next():
jpype._jexception.SQLExceptionPyRaisable: java.sql.SQLException: Query failed (#20171013_015410_01255_8pff8):
```

**Query exceeded maximum time limit of 60.00m**
Obviously such a query can be expected to take some time; I'm fine with that. Chunking also means I know I won't be hitting any RAM limits; in fact, the file output from the run shows the query had gotten through about 16M of the 17M rows before crashing!
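For context, the fetch-and-write loop is essentially the following (a simplified sketch of `sql_csv.py`; the driver class, JDBC URL, credentials, and query text are placeholders rather than my real values):

```python
import pandas as pd
import jaydebeapi

# Placeholders: the real script supplies the actual JDBC driver class,
# URL, credentials, driver jar, and query text.
conn = jaydebeapi.connect("DRIVER_CLASS", "JDBC_URL",
                          ["USER", "PASSWORD"], "path/to/driver.jar")
hours_query = "SELECT ..."  # returns roughly 17M rows

# Stream the result set 500,000 rows at a time and append each chunk to
# a CSV, so the full 17M rows never have to sit in memory at once.
first = True
for chunk in pd.read_sql_query(hours_query, conn, chunksize=500000):
    chunk.to_csv("hours.csv", mode="w" if first else "a",
                 header=first, index=False)
    first = False
```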
But I don't see any direct option in `read_sql_query` for this. `params` seems like a decent candidate, but I can't find any hint in the `jaydebeapi` documentation of what the right parameter to pass through to `execute` might be.
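As far as I can tell, `params` only supplies bind values that pandas hands straight through to `cursor.execute`, along the lines of the hypothetical sketch below (qmark placeholders, reusing `conn` from the sketch above; the table and column names are made up), which doesn't obviously touch a server-side time limit:

```python
import pandas as pd

# Hypothetical illustration of params: it just fills qmark-style
# placeholders in the SQL text; pandas forwards it to cursor.execute().
bounded_query = "SELECT * FROM hours WHERE day >= ? AND day < ?"
for chunk in pd.read_sql_query(bounded_query, conn,
                               params=("2017-01-01", "2017-10-01"),
                               chunksize=500000):
    ...  # process each chunk
```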
How can I overcome this and run my full query?