
I have a DataFrame that I need to write to Kafka.

I have the Avro schema defined, similar to this:

{
    "namespace": "my.name.space",
    "type": "record",
    "name": "MyClass",
    "fields": [
       {"name": "id", "type": "string"},
       {"name": "parameter1", "type": "string"},
       {"name": "parameter2", "type": "string"},
       ...
     ]
}

and it is auto-generated into a Java bean, something similar to this:

public class MyClass extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
  String id;
  String parameter1;
  String parameter2;
  ...
}

I found that to write in Avro format there is only the to_avro method, which takes a column.

So my question is: is there a way to force writing to Kafka in Avro format with this defined schema?

Mahmoud Hanafy

1 Answer


You can only do this when using Confluent's Schema Registry. See https://aseigneurin.github.io/2018/08/02/kafka-tutorial-4-avro-and-schema-registry.html for a walkthrough of Avro and the Schema Registry.

thebluephantom
  • How is this possible with Spark? Or do you have to write the data manually using foreachPartition? – Mahmoud Hanafy Jun 15 '20 at 11:22
  • I have never tried it, but when I talked to a colleague, it sounded convoluted. He told me they did not have Confluent there due to these types of complications. – thebluephantom Jun 15 '20 at 11:51