0

I am running a AWS Glue job to execute stored procedures in an Oracle database. I want to be able to catch the sql exception when a stored procedure fails. I am using 'from py4j.java_gateway import java_import' to set up the connection and execute SQL commands on the connection.

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import boto3

############################################
### UPDATE THE STORE PROCEDURE NAME HERE ###

sp_names = [
'DEV_WF_POC1',
'DEV_WF_POC2'
        ]
############################################

#Set the conection name (Will be replaced for FT and PROD by powershell deployment script)
glue_connection_name = 'dw-dev-connection'

#Use systems args to return job name and pass to local variable
args = getResolvedOptions(sys.argv, ['JOB_NAME','WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
workflow_name = args['WORKFLOW_NAME']
workflow_run_id = args['WORKFLOW_RUN_ID']
glue_job_name = args['JOB_NAME']

#Create spark handler and update status of glue job     
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(glue_job_name, args)
job.commit()
logger = glueContext.get_logger()
glue_client = boto3.client('glue')


#Extract connection details from Data Catelog     
source_jdbc_conf = glueContext.extract_jdbc_conf(glue_connection_name)

#Import Python for Java Java.sql libs    
from py4j.java_gateway import java_import
java_import(sc._gateway.jvm,"java.sql.Connection")
java_import(sc._gateway.jvm,"java.sql.DatabaseMetaData")
java_import(sc._gateway.jvm,"java.sql.DriverManager")
java_import(sc._gateway.jvm,"java.sql.SQLException")

#Extract the URL from the JDBC connection 
oracleurl = source_jdbc_conf.get('url')

# Update connection string to expected Oracle format
oracleurl = oracleurl.replace("oracle://", "oracle:thin:@")
oracleurl = oracleurl + ':orcl'

#Create the connection to the Oracle database with java.sql
conn = sc._gateway.jvm.DriverManager.getConnection(oracleurl, source_jdbc_conf.get('user'),      source_jdbc_conf.get('password'))

#Change autocommit to false to avoid Table lock error 
conn.setAutoCommit(False);

# error dict
errs = {}
err = ''
try:

    for sp_name in sp_names:

    #Prepare call stored procedure statement and execute
    cstmt = conn.prepareCall("{call reporting." + sp_name + "}");
    results = cstmt.execute();
    
    conn.commit();

# capture error

except Exception as e: # work on python 3.x
    ##errs['msg'] = str(sc._gateway.jvm.SQLException.getMessage())- doesn't work 
    errs['error'] = str(e)
    errs['sp_name'] = sp_name
    errs['error_type'] = str(type(e)).replace("<class '","").replace("'>","")

if len(errs) != 0:
    stmt = conn.createStatement();
    sql = "insert into dev_workflow_errors (timestamp, workflow_id, workflow_name, job_name,  sp_name, error_type, error) values (current_timestamp, '" + workflow_run_id + "', '" + workflow_name + "', '" + glue_job_name + "', '" + errs['sp_name'] + "', '" + errs['error_type'] + "', '" + errs['msg'] + "')"
    rs = stmt.executeUpdate(sql);
    conn.commit();
    #sys.exit(1)

#Close down the connection
conn.close();

#Update Logger
logger.info("Finished")

I tried the pythonic 'Try' and 'Except' method but for the base exception I just get the full 'py4j.protocol.Py4JJavaError' error message. Inside this message contains the database specific error messages I want to extract.

Can I use

java_import(sc._gateway.jvm,"java.sql.SQLException")

in any way to extract database specific errors from the execute function?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 1
    Does python-oracledb install cleanly in AWS Glue? If so, then you could use this API instead of the heavy-weight 'call java from python' approach. With python-oracledb you won't need Oracle Instant Client but you will need python's cryptography package. (Technically you need one or the other: either Oracle Client or the cryptography package. The default install just uses the latter.) – Christopher Jones Nov 16 '22 at 21:20
  • Unfortunately, oracledb is not supported in AWS glue. – Marcus Trinder Nov 18 '22 at 14:28
  • What definition of 'support' is that? Do you mean not available, or doesn't run or...? I know you can run the older version (cx_Oracle), see https://stackoverflow.com/questions/70972408/unable-to-connect-oracle-database-using-cx-oracle-from-aws-glue – Christopher Jones Nov 18 '22 at 21:01

0 Answers0