We were trying to use the spark-redshift project, following the third of its recommended ways of providing credentials, namely:
IAM instance profiles: If you are running on EC2 and authenticate to S3 using IAM and instance profiles, then you must configure the temporary_aws_access_key_id, temporary_aws_secret_access_key, and temporary_aws_session_token configuration properties to point to temporary keys created via the AWS Security Token Service. These temporary keys will then be passed to Redshift via LOAD and UNLOAD commands.
Our Spark application runs on an EMR cluster, so we tried to obtain temporary credentials from inside the cluster's instances by calling getSessionToken, like this:
import com.amazonaws.auth.InstanceProfileCredentialsProvider
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient
import com.amazonaws.services.securitytoken.model.GetSessionTokenRequest

val stsClient = new AWSSecurityTokenServiceClient(new InstanceProfileCredentialsProvider())
val getSessionTokenRequest = new GetSessionTokenRequest()
val sessionTokenResult = stsClient.getSessionToken(getSessionTokenRequest)
val sessionCredentials = sessionTokenResult.getCredentials()
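For reference, the plan is to feed those temporary keys into the spark-redshift read roughly like this (just a sketch: sqlContext is our SQLContext, and the JDBC URL, table name, and tempdir below are placeholders):

// Sketch of passing the STS temporary keys to spark-redshift.
// url, dbtable and tempdir are placeholders for our real settings.
val df = sqlContext.read
  .format("com.databricks.spark.redshift")
  .option("url", "jdbc:redshift://redshifthost:5439/database?user=u&password=p")
  .option("dbtable", "my_table")
  .option("tempdir", "s3n://my-bucket/tmp")
  .option("temporary_aws_access_key_id", sessionCredentials.getAccessKeyId)
  .option("temporary_aws_secret_access_key", sessionCredentials.getSecretAccessKey)
  .option("temporary_aws_session_token", sessionCredentials.getSessionToken)
  .load()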
But the getSessionToken call throws a 403 Access Denied, even though a policy allowing sts:getSessionToken is attached to the role of the EMR instances.
Then we tried the following two alternatives. First, using the AssumeRole policy:
import com.amazonaws.auth.{AWSSessionCredentials, STSAssumeRoleSessionCredentialsProvider}

val p = new STSAssumeRoleSessionCredentialsProvider("arn:aws:iam::123456798123:role/My_EMR_Role", "session_name")
val credentials: AWSSessionCredentials = p.getCredentials
val token = credentials.getSessionToken
and second, casting the result from InstanceProfileCredentialsProvider:
val provider = new InstanceProfileCredentialsProvider()
val credentials: AWSSessionCredentials = provider.getCredentials.asInstanceOf[AWSSessionCredentials]
val token = credentials.getSessionToken
They both work, but which is the expected way of doing this? Is there something terribly wrong with casting the result, or with adding the AssumeRole policy?
Thanks!