What is the shortest and safest way to make all fields in an Avro schema nullable?
Of course, I could work with the schema's JSON and do something like schema.toString().replaceAll("\"type\": \"long\"", "\"type\": [\"null\", \"long\"]"), but that is an ugly and unsafe solution.

mrkv
1 Answer
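The code below handles union fields: when a union contains null in any position other than the first and the field has no default, it moves null to the front and adds a null default. It also recurses into nested records, including records inside arrays. Fields that are not unions are left untouched; to cover them, add another condition that wraps the primitive type in a union with 'null' and sets a null default.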
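import java.io.File;
import java.io.PrintWriter;
import java.util.Collections;
import java.util.stream.IntStream;

import org.apache.avro.JsonProperties;
import org.apache.avro.Schema;
...
String srcSchemaFile = "sample.avsc"; // Source Avro schema file
String targetSchemaFile = "sample_fixed.avsc"; // Target Avro schema file
Schema.Parser avroParser = new Schema.Parser();
Schema schema = avroParser.parse(new File(srcSchemaFile));
makeNullable(schema);
// try-with-resources closes the writer even if write() throws
try (PrintWriter writer = new PrintWriter(targetSchemaFile)) {
    // Rename the placeholder property set by makeNullable to the real "default" key
    writer.write(schema.toString().replaceAll("defaultXXX", "default"));
}
...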
private static void makeNullable(Schema schema) {
    // getFields() is only valid on records and throws for any other type
    if (schema.getType() != Schema.Type.RECORD) {
        return;
    }
    for (Schema.Field field : schema.getFields()) {
        Schema fieldType = field.schema();
        if (fieldType.getType() != Schema.Type.UNION) {
            continue; // only union fields are handled here
        }
        int nullIndex = IntStream.range(0, fieldType.getTypes().size())
                .filter(i -> fieldType.getTypes().get(i).getType() == Schema.Type.NULL)
                .findFirst().orElse(-1);
        if (nullIndex > 0 && field.defaultVal() == null) {
            // "default" is a reserved property and cannot be added through addProp,
            // so add defaultXXX here and rename it later as a workaround
            field.addProp("defaultXXX", JsonProperties.NULL_VALUE);
            Collections.swap(fieldType.getTypes(), 0, nullIndex);
        }
        // Recurse into records nested directly in the union or inside arrays
        for (Schema branch : fieldType.getTypes()) {
            if (branch.getType() == Schema.Type.RECORD) {
                makeNullable(branch);
            } else if (branch.getType() == Schema.Type.ARRAY) {
                Schema element = branch.getElementType();
                // getTypes() is only valid on unions; other element types are passed on directly
                if (element.getType() == Schema.Type.UNION) {
                    for (Schema elemSchema : element.getTypes()) {
                        makeNullable(elemSchema);
                    }
                } else {
                    makeNullable(element);
                }
            }
        }
    }
}
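
For the primitive-type case mentioned above, one alternative to mutating the parsed schema is to build a new record whose fields are all wrapped in a union with null. The following is only a minimal sketch, not part of the original answer: the class name NullableSchemas and the helpers withNullableFields and nullable are hypothetical, it assumes Avro 1.8 or later (for the Schema.Field constructor that takes an Object default), it handles top-level fields only, it replaces any existing default with null, and it does not copy aliases or custom properties.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.JsonProperties;
import org.apache.avro.Schema;

// Hypothetical helper class, for illustration only.
final class NullableSchemas {

    /** Returns a copy of a record schema in which every top-level field accepts null. */
    static Schema withNullableFields(Schema record) {
        List<Schema.Field> fields = new ArrayList<>();
        for (Schema.Field f : record.getFields()) {
            // Field objects cannot be reused in another record, so create new ones.
            // The default must match the first union branch, hence a null default.
            fields.add(new Schema.Field(
                    f.name(), nullable(f.schema()), f.doc(), JsonProperties.NULL_VALUE));
        }
        Schema copy = Schema.createRecord(
                record.getName(), record.getDoc(), record.getNamespace(), record.isError());
        copy.setFields(fields);
        return copy;
    }

    /** Wraps a schema as ["null", T], or moves "null" to the front of an existing union. */
    static Schema nullable(Schema s) {
        List<Schema> branches = new ArrayList<>();
        branches.add(Schema.create(Schema.Type.NULL));
        if (s.getType() == Schema.Type.UNION) {
            for (Schema branch : s.getTypes()) {
                if (branch.getType() != Schema.Type.NULL) {
                    branches.add(branch); // keep the non-null branches in their original order
                }
            }
        } else if (s.getType() != Schema.Type.NULL) {
            branches.add(s);
        }
        return Schema.createUnion(branches);
    }

    public static void main(String[] args) {
        Schema original = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Sample\",\"fields\":["
                + "{\"name\":\"id\",\"type\":\"long\"},"
                + "{\"name\":\"tag\",\"type\":[\"string\",\"null\"]}]}");
        // Pretty-print the rewritten schema
        System.out.println(withNullableFields(original).toString(true));
    }
}
```

Because the default is passed to the Field constructor, this route does not need the defaultXXX placeholder and the string replacement afterwards. Nested records would still need a recursive step like the one in makeNullable.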

Bahodir Nigmat