
Has anyone been able to write to Kafka with this library from PySpark?

I've been able to successfully read using the code from the README documentation:

from pyspark.sql import Column

# spark_context, schema_registry_url, topic, and data_frame are assumed to be
# defined elsewhere; data_frame holds the raw Kafka records with a binary
# "value" column.
jvm_gateway = spark_context._gateway.jvm
abris_avro = jvm_gateway.za.co.absa.abris.avro
naming_strategy = getattr(getattr(abris_avro.read.confluent.SchemaManager, "SchemaStorageNamingStrategies$"), "MODULE$").TOPIC_NAME()

schema_registry_config_dict = {"schema.registry.url": schema_registry_url,
                               "schema.registry.topic": topic,
                               "value.schema.id": "latest",
                               "value.schema.naming.strategy": naming_strategy}

# Build an immutable Scala Map from the Python dict via the py4j gateway.
conf_map = getattr(getattr(jvm_gateway.scala.collection.immutable.Map, "EmptyMap$"), "MODULE$")
for k, v in schema_registry_config_dict.items():
    conf_map = getattr(conf_map, "$plus")(jvm_gateway.scala.Tuple2(k, v))

# Deserialize the Avro "value" column, then flatten the decoded struct.
deserialized_df = data_frame.select(Column(abris_avro.functions.from_confluent_avro(data_frame._jdf.col("value"), conf_map))
                  .alias("data")).select("data.*")

However, I am struggling to extend this to writing to topics via the to_confluent_avro function.
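For the write side, I expected something symmetric to the read path. Below is a sketch of what I have been trying, not working code: it assumes to_confluent_avro accepts the same (column, Scala Map) arguments as from_confluent_avro, that the columns to publish must first be packed into a single struct column, and that conf_map can be reused as-is for writing (the correct config keys for the write path may differ). kafka_bootstrap_servers is a placeholder.

from pyspark.sql.functions import struct

# Pack all columns into one struct, then serialize that struct to Confluent Avro.
to_write = deserialized_df.select(struct(*deserialized_df.columns).alias("record"))
serialized_df = to_write.select(
    Column(abris_avro.functions.to_confluent_avro(to_write._jdf.col("record"), conf_map))
    .alias("value"))

# Publish the serialized records using Spark's standard Kafka sink.
(serialized_df.write
    .format("kafka")
    .option("kafka.bootstrap.servers", kafka_bootstrap_servers)  # placeholder
    .option("topic", topic)
    .save())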

