Problem
- Iterating over a large
ResultSet
from a Presto query in using JDBC Driver is extremely time consuming. Each batch call seems to take almost exactly 60s, which is suspicious (caused by some timeout maybe?). - For some reason, the initial executeQuery method takes almost exactly 45s, which is suspicious.
Query
I'm executing a simple query against Presto using Java JDBC Presto Driver that looks essentially like:
SELECT stringA, stringB
FROM {table}
LIMIT 500000
stringA
and stringB
are small - on the order of ~10 characters each.
My query completes under 10sec when run in DbVisualizer using a teradata driver.
BUT when I run the same query from a Spring Java app using the 0.230 presto-jdbc driver it seems to get results back in batches (of about 75,000) and each batch takes over a minute to return.
I've done some reading about Presto's targetResultSize
query param but I haven't been able to set it using the JDBC driver / connection. I've read that presto by default will only return data 1MB at a time? Not sure if that's the cause for my above stated problem #1 - would be great to figure out how to configure that.
Java Code
public List<Object> getResultSetUsingDriverManager(ChronoLocalDate chronoLocalDate) throws SQLException {
long start = System.currentTimeMillis();
Properties properties = new Properties();
properties.setProperty("user", USERNAME);
properties.setProperty("password", PASSWORD);
properties.setProperty("SSL", "true");
final Connection connection = DriverManager.getConnection(URL, properties);
log.warn("Presto connection acquired in " + (System.currentTimeMillis() - start) + "ms");
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(query);
log.warn("Presto query executed in " + (System.currentTimeMillis() - start) + "ms");
List<Object> collection = new ArrayList<>();
int counter = 0;
long batchStart = System.currentTimeMillis();
while (resultSet.next()) {
counter++;
if (counter % 1000 == 0) {
log.warn("current count {} and took {}ms", counter, (System.currentTimeMillis() - batchStart));
batchStart = System.currentTimeMillis();
}
}
log.warn("Results extracted in " + (System.currentTimeMillis() - start));
return collection;
}
Output
2020-01-08 17:34:31.704 WARN 29368 --- ... : Presto connection acquired in 0ms
2020-01-08 17:35:16.705 WARN 29368 --- ... : Presto query executed in 45003ms
2020-01-08 17:37:18.242 WARN 29368 --- ... : current count 1000 and took 121537ms
2020-01-08 17:37:18.244 WARN 29368 --- ... : current count 2000 and took 2ms
2020-01-08 17:37:18.245 WARN 29368 --- ... : current count 3000 and took 1ms
...
2020-01-08 17:37:18.294 WARN 29368 --- ... : current count 75000 and took 1ms
2020-01-08 17:38:18.857 WARN 29368 --- ... : current count 76000 and took 60563ms
2020-01-08 17:38:18.858 WARN 29368 --- ... : current count 77000 and took 1ms
...
2020-01-08 17:38:18.941 WARN 29368 --- ... : current count 151000 and took 0ms
2020-01-08 17:39:19.241 WARN 29368 --- ... : current count 152000 and took 60300ms
2020-01-08 17:39:19.242 WARN 29368 --- ... : current count 153000 and took 1ms
...
2020-01-08 17:39:19.311 WARN 29368 --- ... : current count 250000 and took 0ms
2020-01-08 17:39:19.311 WARN 29368 --- ... : Results extracted in 287609
Version Information
- Java 11
- com.facebook.presto presto-jdbc 0.230
- Spring Boot 2.1.6.RELEASE
- Presto version : 302-e.3 (Starburst version)