I have a connection from AWS Glue to Oracle R12 and it seems to work fine when I test it in the "connections" section of AWS Glue:

p-*-oracleconnection connected successfully to your instance.

I can crawl all the tables etc. and get the whole schema without a problem.

However, as soon as I try to use these crawled tables in a Glue job, I get this:

py4j.protocol.Py4JJavaError: An error occurred while calling o64.getDynamicFrame.
: java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection

Connection string (sanitised, obviously):

jdbc:oracle:thin://@xxx.xxx.xxx.xxx:1000:FOOBAR

Loading into a DynamicFrame:

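import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Resolve the arguments passed to the Glue job
args = getResolvedOptions(sys.argv, ["JOB_NAME", "INPUT_DATABASE", "INPUT_TABLE_NAME"])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)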

datasource0 = glueContext.create_dynamic_frame.from_catalog(
    database=args['INPUT_DATABASE'],
    table_name=args['INPUT_TABLE_NAME'],
    transformation_ctx="datasource0",
)

where the Glue job arguments are:

--INPUT_DATABASE p-*-source-database
--INPUT_TABLE_NAME foobar_xx_xx_animals

I have validated that both exist in AWS Glue.

Reasons I have to keep using Spark on Glue:

  • Job Bookmark

Reasons I have to use Glue's built-in connections rather than connecting directly from Spark:

  • VPC is needed

I just don't understand why I can crawl all the tables and get all the metadata, but as soon as I try to load a table into a DynamicFrame, it errors out.

ck3mp

1 Answer

I encountered the same issue, but with an Azure SQL Database instead of an Oracle database. I solved it by first writing the data (using the connection) to S3. Afterwards, I crawled the S3 data into a new Glue Catalog table. Lastly, I transferred the S3 data to the RDS instance.
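
The first step of that workaround (read through the catalog connection, then stage to S3) might look roughly like the sketch below. This is only an illustration under assumptions: the function name `stage_table_to_s3`, the `transformation_ctx` labels, the Parquet format, and the S3 path are hypothetical and not taken from the actual job. The `GlueContext` is passed in, so the function itself has no import-time dependency on `awsglue`.

```python
def stage_table_to_s3(glue_context, database, table_name, s3_path):
    """Read a catalog table via its Glue connection and write it to S3.

    glue_context is expected to be an awsglue.context.GlueContext.
    s3_path is a hypothetical staging location, e.g. "s3://my-bucket/staging/".
    """
    # Read the source table through the Glue Data Catalog (uses the JDBC connection)
    frame = glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=table_name,
        transformation_ctx="staging_source",
    )
    # Write the data to S3; Parquet is an assumption, any supported format works
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={"path": s3_path},
        format="parquet",
        transformation_ctx="staging_sink",
    )
    return frame
```

In a job like the one in the question, this would be called as `stage_table_to_s3(glueContext, args["INPUT_DATABASE"], args["INPUT_TABLE_NAME"], "s3://...")`, and the staged data would then be crawled and loaded by a second job.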

While researching the problem, I found that Glue struggles with two connections. I assume that if you try to write from one database instance to another, two connections are established in the background. I have no idea whether that's the reason (I am new to AWS), or how it would lead to a timeout error, but here we are. Hopefully this helps someone.
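
To narrow down whether the job itself can reach the database at all, a plain TCP check run from inside the job may help. The sketch below assumes that "The Network Adapter could not establish the connection" means the driver could not open a socket to the listener, which is the usual reading of that Oracle error but is not confirmed for this case. The helper names `parse_thin_url` and `can_connect` are hypothetical, and only the Python standard library is used.

```python
import re
import socket


def parse_thin_url(url):
    """Extract (host, port) from an Oracle thin JDBC URL.

    Accepts forms such as jdbc:oracle:thin://@host:port:SID
    and jdbc:oracle:thin:@host:port/service_name.
    """
    match = re.match(r"^jdbc:oracle:thin:(?://)?@(?://)?([^:/]+):(\d+)[:/]", url)
    if match is None:
        raise ValueError(f"unrecognised Oracle thin URL: {url}")
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect(*parse_thin_url(connection_string))` early in the job script and printing the result would show whether the driver node can reach the host and port. If it returns False while the connection test and the crawler succeed, that would point at the job's network setup (for example, the connection not being attached to the job, or its subnet and security group) rather than at the catalog tables.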

Sholly