I am looking for an example of using Bijection with an Avro SpecificRecordBase object, similar to what is possible for a GenericRecord (shown below), or a simpler way to use the AvroSerializer class as the Kafka key and value serializer.

Injection<GenericRecord, byte[]> genericRecordInjection = GenericAvroCodecs.toBinary(schema);
byte[] bytes = genericRecordInjection.apply(type);
bdparrish

2 Answers

The kafka-storm-starter project (https://github.com/miguno/kafka-storm-starter) provides example code for this.

See, for instance, AvroDecoderBolt. From its javadocs:

This bolt expects incoming data in Avro-encoded binary format, serialized according to the Avro schema of T. It will deserialize the incoming data into a T pojo, and emit this pojo to downstream consumers. As such this bolt can be considered the Storm equivalent of Twitter Bijection's Injection.invert[T, Array[Byte]](bytes) for Avro data.

where

T: The type of the Avro record (e.g. a Tweet) based on the underlying Avro schema being used. Must be a subclass of Avro's SpecificRecordBase.

The key part of the code is (I collapsed the code into this snippet):

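import com.google.common.base.Throwables
import com.twitter.bijection.Injection
import com.twitter.bijection.avro.SpecificAvroCodecs
import org.apache.avro.specific.SpecificRecordBase
import scala.util.{Failure, Success, Try}

// With T <: SpecificRecordBase

implicit val specificAvroBinaryInjection: Injection[T, Array[Byte]] =
  SpecificAvroCodecs.toBinary[T]

val bytes: Array[Byte] = ... // the Avro-encoded data
// invert returns a Try because arbitrary bytes may not be valid Avro for T
val decodeTry: Try[T] = Injection.invert[T, Array[Byte]](bytes)
decodeTry match {
  case Success(pojo) =>
    System.out.println("Binary data decoded into pojo: " + pojo)
  case Failure(e) =>
    log.error("Could not decode binary data: " + Throwables.getStackTraceAsString(e))
}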
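
For readers unfamiliar with the Injection contract used above (apply always succeeds, invert may fail), here is a library-free Java sketch of the idea. It does not use Bijection or Avro: `SimpleInjection` and `UTF8` are hypothetical names made up for illustration, and `Optional` stands in for the `Try` that Bijection's invert returns.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class InjectionSketch {

    /** Hypothetical stand-in for the Injection idea; NOT Bijection's actual API. */
    interface SimpleInjection<A, B> {
        B apply(A a);             // encoding always succeeds
        Optional<A> invert(B b);  // decoding may fail (Bijection returns a Try here)
    }

    /** String to UTF-8 bytes: every String encodes, but not every byte[] decodes. */
    static final SimpleInjection<String, byte[]> UTF8 = new SimpleInjection<String, byte[]>() {
        @Override
        public byte[] apply(String s) {
            return s.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public Optional<String> invert(byte[] bytes) {
            // Report malformed input instead of silently replacing it
            CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT);
            try {
                return Optional.of(decoder.decode(ByteBuffer.wrap(bytes)).toString());
            } catch (CharacterCodingException e) {
                return Optional.empty();
            }
        }
    };

    public static void main(String[] args) {
        byte[] bytes = UTF8.apply("tweet");
        System.out.println(UTF8.invert(bytes));                     // prints "Optional[tweet]"
        System.out.println(UTF8.invert(new byte[] {(byte) 0xFF}));  // prints "Optional.empty"
    }
}
```

The Avro case has the same shape: encoding a record to bytes is expected to succeed, while decoding arbitrary bytes can fail, which is why the snippet above pattern-matches on Success and Failure.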
miguno
Schema.Parser parser = new Schema.Parser();
Schema schema = parser.parse(new File("/Users/.../schema.avsc"));
Injection<Command, byte[]> objectInjection = SpecificAvroCodecs.toBinary(schema);
byte[] bytes = objectInjection.apply(c);
non sequitor
  • Am I correct to assume that the schema will still be part of the object itself here, since a .getSchema() method is available on the (generic) record? To me, this seems to defeat the whole purpose of having a separate schema. – Havnar Aug 01 '18 at 12:55