1

How I can create an Athena data source in AWS CDK which is a JDBC connection to a MySQL database using the AthenaJdbcConnector?

I believe I can use aws-sam's CfnApplication to create the AthenaJdbcConnector Lambda, but how can I connect it to Athena?

I notice a lot of Glue support in CDK which would transfer to Athena (data catalog), and there are several CfnDataSource types in other modules such as QuickSight, but I'm not seeing anything under Athena in CDK.

See the image and references below.

References:

CosmicVarion
  • 116
  • 9

2 Answers2

4

I have been playing with the same issue. Here is what I did to create the Lambda for federated queries (Typescript):

    const vpc = ec2.Vpc.fromLookup(this, 'my-project-vpc', {
        vpcId: props.vpcId
    });

    const cluster = new rds.ServerlessCluster(this, 'AuroraCluster', {
        engine: rds.DatabaseClusterEngine.AURORA_POSTGRESQL,
        parameterGroup: rds.ParameterGroup.fromParameterGroupName(this, 'ParameterGroup', 'default.aurora-postgresql10'),
        defaultDatabaseName: 'MyDB',
        vpc,
        vpcSubnets: {
            onePerAz: true
        },
        scaling: {autoPause: cdk.Duration.seconds(0)} // Optional. If not set, then instance will pause after 5 minutes
    });

    let password = cluster.secret!.secretValueFromJson('password').toString()

    let spillBucket = new Bucket(this, "AthenaFederatedSpill")


    let lambdaApp = new CfnApplication(this, "MyDB", {
        location: {
            applicationId: "arn:aws:serverlessrepo:us-east-1:292517598671:applications/AthenaJdbcConnector",
            semanticVersion: "2021.42.1"
        },
        parameters: {
            DefaultConnectionString: `postgres://jdbc:postgresql://${cluster.clusterEndpoint.hostname}/MyDB?user=postgres&password=${password}`,
            LambdaFunctionName: "crossref_federation",
            SecretNamePrefix: `${cluster.secret?.secretName}`,
            SecurityGroupIds: `${cluster.connections.securityGroups.map(value => value.securityGroupId).join(",")}`,
            SpillBucket: spillBucket.bucketName,
            SubnetIds: vpc.privateSubnets[0].subnetId
        }
    })

This creates the lambda with a default connection string like you would have it, if you used the AWS Console wizard in Athena to connect to a DataSource. Unfortunately it is NOT possible to add a Athena-catalog specific connection string via CDK. It should be set as an Environment Variable on the Lambda, and I found no way to do that. The Application template simply don't allow it, so this is a post-process by hand. I would sure like to hear from anybody if they have a solution for that!

Also notice that I add the user/password in the jdbc URL directly. I wanted to use SecretsManager, but because the Lambda is deployed in a VPC, it simply refuses to connect to the secretsmanager. I think this might be solvable by added a private VPN connection to SSM. Again - I would like to hear from anybody have tried that.

  • 2
    Awesome. I figured out `CfnApplication`, too. You can use [getFunction-property](https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Lambda.html#getFunction-property) and then add your variable and update it with [updateFunctionConfiguration-property](https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Lambda.html#updateFunctionConfiguration-property) to add the specific connection string after deployment. – CosmicVarion Nov 16 '21 at 14:08
  • "it simply refuses to connect to the secretsmanager" - you need to add a VPC Interface Endpoint for Secrets Manager to your VPC – Flops Mar 18 '22 at 23:35
  • hey any one found any workarounds for this issue, thanks!! – balajivaishnav Aug 04 '22 at 06:29
0

DO NOT COPY PASTE THE TOP SOLUTION. It hardcodes a database password as plaintext cloud formation.

As the OP hinted at: the correct solution is to create an interface endpoint for SecretsManager in the VPC you deploy the connector into. You must also have DNS enabled in the VPC. This is reflected in the AWS documentation on Athena data source connectors.

To use the Athena Federated Query feature with AWS Secrets Manager, you must configure an Amazon VPC private endpoint for Secrets Manager. For more information, see Create a Secrets Manager VPC private endpoint in the AWS Secrets Manager User Guide.

Eric Aya
  • 69,473
  • 35
  • 181
  • 253
GusBuonv
  • 1
  • 1