I am trying to ingest data from Presto using Hadoop Sqoop, which throws this error:

18/07/12 10:34:05 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
18/07/12 10:34:06 ERROR manager.SqlManager: Error executing statement: java.sql.SQLException: [Teradata][Presto](100050) Query failed: Current transaction is aborted, commands ignored until end of transaction block.
java.sql.SQLException: [Teradata][Presto](100050) Query failed: Current transaction is aborted, commands ignored until end of transaction block.
    at com.teradata.presto.presto.PRUtils.parseError(Unknown Source)
    at com.teradata.presto.presto.dataengine.PRResultSet.parseDataString(Unknown Source)
    at com.teradata.presto.presto.dataengine.PRResultSet.execute(Unknown Source)
    at com.teradata.presto.presto.dataengine.PRResultSet.<init>(Unknown Source)
    at com.teradata.presto.presto.dataengine.PRQueryExecutor.<init>(Unknown Source)
    at com.teradata.presto.presto.dataengine.PRDataEngine.prepare(Unknown Source)
    at com.teradata.presto.jdbc.common.SPreparedStatement.<init>(Unknown Source)
    at com.teradata.presto.jdbc.jdbc41.S41PreparedStatement.<init>(Unknown Source)
    at com.teradata.presto.jdbc.jdbc42.S42PreparedStatement.<init>(Unknown Source)
    at com.teradata.presto.jdbc.jdbc42.JDBC42ObjectFactory.createPreparedStatement(Unknown Source)
    at com.teradata.presto.presto.jdbc42.PRJDBC42ObjectFactory.createPreparedStatement(Unknown Source)
    at com.teradata.presto.jdbc.common.SConnection.prepareStatement(Unknown Source)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:744)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:767)
    at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:270)
    at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
    at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
    at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
    at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1833)
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)

There are no errors on the Presto DB side, however. Sqoop command:

sqoop-import \
--verbose \
--driver com.teradata.presto.jdbc42.Driver \
--connect 'jdbc:presto://bla:8443/hive?SSL=true&SSLTrustStorePath=bla&SSLTrustStorePassword=bla' \
--username bla -P \
--table hive.bla.blah \
--m 2

Teradata Presto Driver version: PrestoJDBC42-1.0.21.1031.jar

  • I'm pretty sure that error is because the previous query failed. Look in the logs for a failed query before this one. – Dain Sundstrom Jul 12 '18 at 15:35
  • Have you tried using the Presto JDBC driver? – David Phillips Jul 12 '18 at 17:17
  • Yes @DainSundstrom, indeed there was an error prior to this one: com.facebook.presto.spi.PrestoException: Connector supported isolation level READ UNCOMMITTED does not meet requested isolation level READ COMMITTED. Though Sqoop's default is "READ COMMITTED", I don't know why the query isn't being made at the "READ UNCOMMITTED" isolation level the connector supports (see the JDBC sketch after these comments). Maybe Presto is not retaining the isolation level as part of the transaction sequence. Please let me know if you have any opinion on this. – Sudheer Palyam Jul 13 '18 at 01:49
  • Yes @DavidPhillips, I did try with both com.facebook.presto.jdbc.PrestoDriver & com.teradata.presto.jdbc42.Driver. Same issue. – Sudheer Palyam Jul 13 '18 at 01:49
  • Each thing Presto connects to has different capabilities. When a catalog gets added to a transaction, the system verifies that the connector can meet the requirements of the transaction. For Hive, the connector only supports READ UNCOMMITTED because some HDFS clients keep uncommitted data in partition output (e.g., standard S3 behavior). – Dain Sundstrom Jul 13 '18 at 22:49
  • Just a quick clarification: it's a conceptual error to think of ingesting "from a Presto database". Presto is a query engine that in turn talks to real databases, as well as file-based datasets and anything else for which you have a connector. So as @DainSundstrom indicates, the capabilities of the underlying sources and the connectors on top of them determine what's possible. – Carnot Antonio Romero Sep 13 '18 at 21:33
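
For reference, the negotiation described in these comments happens at the plain JDBC level. Below is a minimal sketch of a client that explicitly requests the isolation level the Hive connector can satisfy. It assumes a Presto JDBC driver on the classpath, and the URL, catalog, and credentials are the placeholders from the question, so treat it as illustrative rather than the exact code path Sqoop runs.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PrestoIsolationSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials taken from the question.
        Connection conn = DriverManager.getConnection(
                "jdbc:presto://bla:8443/hive?SSL=true", "bla", "bla");

        // Sqoop defaults to READ COMMITTED. Against the Hive catalog Presto
        // can only honor READ UNCOMMITTED, so the transaction aborts and every
        // later statement fails with "Current transaction is aborted".
        // Explicitly requesting the supported level avoids the mismatch:
        conn.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);

        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            rs.next();
            System.out.println(rs.getInt(1));
        }
        conn.close();
    }
}

Sqoop decides the isolation level it requests on its own connections, which is why the fix has to happen on the Sqoop side, as the answer below suggests.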

1 Answer


The Hive connector supports only READ UNCOMMITTED, while it looks like Sqoop is requesting READ COMMITTED, which cannot be supported. If you can change Sqoop to request READ UNCOMMITTED, then it should work.
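
As a sketch of what that could look like: recent Sqoop releases document a --relaxed-isolation flag that instructs the mappers to read at READ UNCOMMITTED. Applied to the command from the question (host, truststore, and table names are still the asker's placeholders):

sqoop-import \
--verbose \
--relaxed-isolation \
--driver com.teradata.presto.jdbc42.Driver \
--connect 'jdbc:presto://bla:8443/hive?SSL=true&SSLTrustStorePath=bla&SSLTrustStorePassword=bla' \
--username bla -P \
--table hive.bla.blah \
--m 2

Note that per the Sqoop documentation --relaxed-isolation applies to the mapper connections; the statement failing in the stack trace above is the code-generation metadata query, so whether this flag alone covers it depends on your Sqoop version.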

kokosing