
I am trying to create an AWS Glue Streaming job that reads from Kafka (MSK) clusters using SASL/SCRAM client authentication for the connection, per https://aws.amazon.com/about-aws/whats-new/2022/05/aws-glue-supports-sasl-authentication-apache-kafka/

The Glue connection has the following properties (plus the appropriate subnet and security groups):

"ConnectionProperties": {
            "KAFKA_SASL_SCRAM_PASSWORD": "apassword",
            "KAFKA_BOOTSTRAP_SERVERS": "theserver:9096",
            "KAFKA_SASL_MECHANISM": "SCRAM-SHA-512",
            "KAFKA_SASL_SCRAM_USERNAME": "auser",
            "KAFKA_SSL_ENABLED": "false"
        }

And the actual API call is:

df = glue_context.create_data_frame.from_options(
        connection_type="kafka",
        connection_options={
            "connectionName": "kafka-glue-connector",
            "security.protocol": "SASL_SSL",
            "classification": "json",
            "startingOffsets": "latest",
            "topicName": "atopic",
            "inferSchema": "true",
            "typeOfData": "kafka",
            "numRetries": 1,
        }
)
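
For comparison, this is what a plain Kafka client needs for SASL/SCRAM over TLS, written as a minimal sketch. The helper name `scram_client_properties` is hypothetical; the three property keys and the `ScramLoginModule` class are standard Kafka client settings. Whether Glue derives these from the connection, or accepts them through `connection_options`, is exactly what I have not been able to confirm.

```python
def scram_client_properties(username, password, mechanism="SCRAM-SHA-512"):
    """Build plain Kafka client properties for SASL/SCRAM over TLS.

    Hypothetical helper for illustration only; it is not part of the Glue API.
    """
    # JAAS entry that the Kafka client uses for SCRAM logins.
    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": mechanism,
        "sasl.jaas.config": jaas,
    }


props = scram_client_properties("auser", "apassword")
for key, value in props.items():
    print(f"{key} = {value}")
```

In Spark Structured Streaming's Kafka source these keys are passed with a `kafka.` prefix (for example `kafka.sasl.mechanism`); I am assuming, not asserting, that something equivalent has to happen inside Glue.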

When the job runs, the logs show the client attempting to connect to the brokers using Kerberos (GSSAPI), and it fails with:

22/10/19 18:45:54 INFO ConsumerConfig: ConsumerConfig values: 
    sasl.mechanism = GSSAPI
    security.protocol = SASL_SSL
    security.providers = null
    send.buffer.bytes = 131072
    ...

org.apache.kafka.common.errors.SaslAuthenticationException: Failed to configure SaslClientAuthenticator
Caused by: org.apache.kafka.common.KafkaException: Principal could not be determined from Subject, this may be a transient failure due to Kerberos re-login
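
`GSSAPI` is the Kafka client's default value for `sasl.mechanism`, so the dump above suggests the `SCRAM-SHA-512` setting from the connection never reached the consumer. A small sketch for checking this from the job logs: `parse_consumer_config` is a hypothetical helper, and the sample text is just the excerpt above.

```python
import re


def parse_consumer_config(log_text):
    """Collect 'key = value' pairs from a ConsumerConfig dump in the logs."""
    pattern = re.compile(r"^\s*([A-Za-z0-9_.]+)\s*=\s*(.*?)\s*$")
    values = {}
    for line in log_text.splitlines():
        match = pattern.match(line)
        if match:
            values[match.group(1)] = match.group(2)
    return values


sample = """22/10/19 18:45:54 INFO ConsumerConfig: ConsumerConfig values:
    sasl.mechanism = GSSAPI
    security.protocol = SASL_SSL
    security.providers = null
    send.buffer.bytes = 131072
"""

effective = parse_consumer_config(sample)
expected_mechanism = "SCRAM-SHA-512"  # value set in the Glue connection
mechanism_matches = effective.get("sasl.mechanism") == expected_mechanism
print("effective sasl.mechanism:", effective.get("sasl.mechanism"))
print("matches connection setting:", mechanism_matches)
```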

How can I authenticate the AWS Glue job using SASL/SCRAM? What properties do I need to set in the connection and in the method call?

Thank you
