I have a Glue job that reads directly from Redshift, which requires connection credentials. I have created an embedded Glue connection and can extract the credentials with the following PySpark code. Is there a way to do this in Scala?

import boto3

glue = boto3.client('glue', region_name='us-east-1')

response = glue.get_connection(
    Name='name-of-embedded-connection',
    HidePassword=False
)
    
table = spark.read.format(
    'com.databricks.spark.redshift'
).option(
    'url',
    'jdbc:redshift://prod.us-east-1.redshift.amazonaws.com:5439/db'
).option(
    'user',
    response['Connection']['ConnectionProperties']['USERNAME']
).option(
    'password',
    response['Connection']['ConnectionProperties']['PASSWORD']
).option(
    'dbtable',
    'db.table'
).option(
    'tempdir',
    's3://config/glue/temp/redshift/'
).option(
    'forward_spark_s3_credentials', 'true'
).load()
jayrythium

1 Answer

AWS does not provide a Scala equivalent of this API call, but you can use Java SDK code inside Scala, as mentioned in this answer.
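A minimal sketch of that Java SDK route from Scala, assuming the AWS SDK for Java v1 Glue module (`aws-java-sdk-glue`) is on the job's classpath. The names `GlueConnectionCredentials`, `fromProperties` and `fetch` are hypothetical and only used for illustration:

```scala
import com.amazonaws.services.glue.AWSGlueClientBuilder
import com.amazonaws.services.glue.model.GetConnectionRequest

// Hypothetical helper object; not part of the AWS SDK.
object GlueConnectionCredentials {

  // Pick the two credential entries out of the connection's property map.
  // Kept separate from the AWS call so it can be checked without network access.
  def fromProperties(props: java.util.Map[String, String]): (String, String) =
    (props.get("USERNAME"), props.get("PASSWORD"))

  // Look up the named Glue connection and return (username, password).
  def fetch(connectionName: String, region: String): (String, String) = {
    val glue = AWSGlueClientBuilder.standard().withRegion(region).build()
    val request = new GetConnectionRequest()
      .withName(connectionName)
      .withHidePassword(false) // counterpart of HidePassword=False in boto3
    val connection = glue.getConnection(request).getConnection
    fromProperties(connection.getConnectionProperties)
  }
}
```

Inside the job this could be called as `val (user, pwd) = GlueConnectionCredentials.fetch("name-of-embedded-connection", "us-east-1")`, using the same connection name and region as the question.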

This is the Java SDK call for getConnection. If you would rather not use it, you can follow the approach below:

  1. Create an AWS Glue Python shell job and retrieve the connection information.

  2. Once you have the values, start the Scala Glue job from inside the Python shell job, passing them as arguments, as shown below:

import boto3

glue = boto3.client('glue', region_name='us-east-1')

response = glue.get_connection(
    Name='name-of-embedded-connection',
    HidePassword=False
)
props = response['Connection']['ConnectionProperties']

# Start the Scala job, passing the credentials as job arguments.
job_run = glue.start_job_run(
    JobName='my_scala_Job',
    Arguments={
        '--username': props['USERNAME'],
        '--password': props['PASSWORD'],
    }
)
  3. Then access these parameters inside your Scala job using getResolvedOptions, as shown below:

import com.amazonaws.services.glue.util.GlueArgParser

// sysArgs is the Array[String] passed to the job's main method
val args = GlueArgParser.getResolvedOptions(
  sysArgs, Array(
    "username",
    "password")
)
val user = args("username")
val pwd  = args("password")
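With `user` and `pwd` resolved, the Redshift read from the question could be ported to Scala roughly as follows. This is only a sketch: it assumes the same `com.databricks.spark.redshift` data source and the same option values as the PySpark code, and `RedshiftRead`, `options` and `readTable` are illustrative names rather than anything from a library.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object mirroring the PySpark read from the question.
object RedshiftRead {

  // Build the reader options; values are copied from the question's code.
  def options(user: String, pwd: String): Map[String, String] = Map(
    "url" -> "jdbc:redshift://prod.us-east-1.redshift.amazonaws.com:5439/db",
    "user" -> user,
    "password" -> pwd,
    "dbtable" -> "db.table",
    "tempdir" -> "s3://config/glue/temp/redshift/",
    "forward_spark_s3_credentials" -> "true"
  )

  // Load the table through the Redshift data source using those options.
  def readTable(spark: SparkSession, user: String, pwd: String): DataFrame =
    spark.read
      .format("com.databricks.spark.redshift")
      .options(options(user, pwd))
      .load()
}
```

In the job, `val table = RedshiftRead.readTable(spark, user, pwd)` would then play the role of the `table` variable in the PySpark version.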
Prabhakar Reddy
  • Thank you for your prompt response. I apologize, I have no experience with Java. How would you run the Java SDK `getConnection` call? – jayrythium Aug 13 '20 at 13:58
  • I don't know how to do that either, which is why I gave you the other approach. – Prabhakar Reddy Aug 13 '20 at 14:05
  • Ah OK, I appreciate it. – jayrythium Aug 13 '20 at 14:15
  • I was able to concoct something but I'm still unable to get it in the right format `com.amazonaws.services.glue.model.GetConnectionRequest`. I have created a separate ticket [here](https://stackoverflow.com/questions/63401202/extract-aws-glue-credentials-from-created-glue-client-scala) – jayrythium Aug 13 '20 at 18:35