I need to convert the following to a Spark DataFrame in Java while preserving the structure according to the Avro schema, and then write it to S3 based on that Avro structure.
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

// Build a record that matches inAvroSchema
GenericRecord r = new GenericData.Record(inAvroSchema);
r.put("id", "1");
r.put("cnt", 111);

// The enum field must be a GenericData.EnumSymbol, not a plain String
Schema enumTest = SchemaBuilder.enumeration("name1")
        .namespace("com.name")
        .symbols("s1", "s2");
GenericData.EnumSymbol symbol = new GenericData.EnumSymbol(enumTest, "s1");
r.put("type", symbol);

// Serialize the record to JSON bytes
ByteArrayOutputStream bao = new ByteArrayOutputStream();
GenericDatumWriter<GenericRecord> w = new GenericDatumWriter<>(inAvroSchema);
Encoder e = EncoderFactory.get().jsonEncoder(inAvroSchema, bao);
w.write(r, e);
e.flush();
I can create an object back from the JSON structure:
GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(inAvroSchema);
Object o = reader.read(null, DecoderFactory.get().jsonDecoder(inAvroSchema, new ByteArrayInputStream(bao.toByteArray())));
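So the fallback I see is to build the Rows by hand and derive the Spark StructType from the Avro schema with spark-avro's SchemaConverters. A rough sketch (assuming a SparkSession named spark, the spark-avro package on the classpath, and that inAvroSchema has exactly the three fields from my example; the bucket path is made up):

import java.util.Collections;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.avro.SchemaConverters;
import org.apache.spark.sql.types.StructType;

// Derive the Spark schema from the Avro schema (spark-avro dependency)
StructType sparkSchema = (StructType) SchemaConverters.toSqlType(inAvroSchema).dataType();

// Map each GenericRecord field to a Row column by hand; the enum becomes its String symbol
List<Row> rows = Collections.singletonList(
        RowFactory.create(r.get("id").toString(), r.get("cnt"), r.get("type").toString()));

Dataset<Row> df = spark.createDataFrame(rows, sparkSchema);

// Write back out as Avro; the "avroSchema" option pins the output to the original schema
df.write().format("avro").option("avroSchema", inAvroSchema.toString()).save("s3a://my-bucket/out/");

That works, but it means maintaining the field-to-column mapping by hand whenever the schema changes.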
But is there any way to create the DataFrame directly from new ByteArrayInputStream(bao.toByteArray())?
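The closest I've come is this untested sketch, continuing from the snippet above: turn the bytes back into a JSON string and let Spark parse it against the derived schema. It assumes the Avro JSON encoding of my schema is plain JSON (no union fields, which Avro wraps in extra objects):

import java.nio.charset.StandardCharsets;
import java.util.Collections;
import org.apache.spark.sql.Encoders;

// One JSON document per record, exactly as the jsonEncoder wrote it
String json = new String(bao.toByteArray(), StandardCharsets.UTF_8);
Dataset<String> jsonDs = spark.createDataset(Collections.singletonList(json), Encoders.STRING());

// Parse the JSON against the StructType derived from the Avro schema
Dataset<Row> df2 = spark.read().schema(sparkSchema).json(jsonDs);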
Thanks