1

I'm trying to do some testing on our JDBC driver using Python.

Initially figuring out JPype, I eventually managed to connect the driver and execute select queries like so (reproducing a generalized snippet):

from __future__ import print_function
from jpype import *

#Start JVM, attach the driver jar
jvmpath = 'path/to/libjvm.so'
classpath = 'path/to/JDBC_Driver.jar'
startJVM(jvmpath, '-ea', '-Djava.class.path=' + classpath)

# Magic line 1
driver = JPackage('sql').Our_Driver

# Initiating a connection via DriverManager()
jdbc_uri = 'jdbc:our_database://localhost:port/database','user', 'passwd')  
conn = java.sql.DriverManager.getConnection(jdbc_uri)

# Executing a statement
stmt = conn.createStatement()
rs = stmt.executeQuery ('select top 10 * from some_table')

# Extracting results
while rs.next():
    ''' Magic #2 - rs.getStuff() only works inside a while loop '''
    print (rs.getString('col_name'))

However, I've failed to to batch inserts, which is what I wanted to test. Even when executeBatch() returned a jpype int[], which should indicate a successful insert, the table was not updated.

I then decided to try out py4j.

My plight - I'm having a hard time figuring out how to do the same thing as above. It is said py4j does not start a JVM on its own, and that the Java code needs to be prearranged with a GatewayServer(), so I'm not sure it's even feasible.

On the other hand, there's a library named py4jdbc that does just that.

I tinkered through the dbapi.py code but didn't quite understand the flow, and am pretty much jammed.

If anyone understands how to load a JDBC driver from a .jar file with py4j and can point me in the right direction, I'd be much grateful.

Jay
  • 2,535
  • 3
  • 32
  • 44

3 Answers3

3

add a commit after adding the records and before retrieving.

conn.commit()
brddawg
  • 434
  • 8
  • 19
  • well, sh**.. thank you. However, getting a int[] back was rare and random, mostly i would not get a return value after an executeBatch(), even when the setInt(), setString(), addBatch() went through. – Jay Sep 21 '17 at 21:39
  • A JDBC driver by default has to be in auto commit mode. Calling `commit()` should throw an exception in that situation. – Mark Rotteveel Sep 22 '17 at 07:37
  • here's a string for the auto commit: java.sql.Connection.setAutocommit(). Also, something like sqlalchemy might be useful. – brddawg Sep 22 '17 at 13:30
  • How/Why would I use SQLAlchemy? – Jay Sep 22 '17 at 13:39
  • 1
    http://docs.sqlalchemy.org/en/latest/core/tutorial.html under the heading 'Selecting'. Shows how to use their module to query a database native to python and avoid sending strings as queries. Just learning about this module myself. – brddawg Sep 22 '17 at 13:47
  • I'm interested in batch inserts over JDBC, but it's good to know, thanks – Jay Sep 22 '17 at 14:09
  • 1
    http://docs.sqlalchemy.org/en/rel_1_0/orm/persistence_techniques.html#bulk-operations - if you're interested. – brddawg Sep 22 '17 at 14:16
  • That might be relevan, thanks. I still need a JDBC connection though, and bulk won't work through Jpype.. – Jay Sep 22 '17 at 16:46
0

In py4j, with your respective JDBC uri:

from py4j.java_gateway import JavaGateway

# Open JVM interface with the JDBC Jar
jdbc_jar_path = '/path/to/jdbc_driver.jar'
gateway = JavaGateway.launch_gateway(classpath=jdbc_jar_path) 

# Load the JDBC Jar
jdbc_class = "com.vendor.VendorJDBC"
gateway.jvm.class.forName(jdbc_class)

# Initiate connection
jdbc_uri = "jdbc://vendor:192.168.x.y:zzzz;..."
con =  gateway.jvm.DriverManager.getConnection(jdbc_uri)

# Run a query
sql = "select this from that"
stmt = con.createStatement(sql)
rs = stmt.executeQuery()
while rs.next():
    rs.getInt(1)
    rs.getFloat(2)
    .
    .
rs.close()
stmt.close()
Jackson
  • 31
  • 6
Jay
  • 2,535
  • 3
  • 32
  • 44
0

I have met a similar problem in airflow, I used teradata jdbc jars and jaydebeapi to connect teradata database and execute sql:

[root@myhost transfer]# cat test_conn.py 
import jaydebeapi
from contextlib import closing


jclassname='com.teradata.jdbc.TeraDriver'
jdbc_driver_loc = '/opt/spark-2.3.1/jars/terajdbc4-16.20.00.06.jar,/opt/spark-2.3.1/jars/tdgssconfig-16.20.00.06.jar'
jdbc_driver_name = 'com.teradata.jdbc.TeraDriver'
host='my_teradata.address'

url='jdbc:teradata://' + host + '/TMODE=TERA'
login="teradata_user_name"
psw="teradata_passwd"

sql = "SELECT COUNT(*) FROM  A_TERADATA_TABLE_NAME where month_key='202009'"


conn = jaydebeapi.connect(jclassname=jdbc_driver_name,
                                  url=url, 
                                  driver_args=[login, psw],
                                  jars=jdbc_driver_loc.split(","))

with closing(conn) as conn:
    with closing(conn.cursor()) as cur:
        cur.execute(sql)
        print(cur.fetchall())
   
[root@myhost transfer]# python test_conn.py
[(7734133,)]
[root@myhost transfer]# 
kain
  • 214
  • 2
  • 4