1

Is it possible to remove a nested field using an SMT with Kafka Connect?

I know the following works perfectly:

"transforms": "ReplaceField",
"transforms.ReplaceField.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.ReplaceField.blacklist": "FieldFoo"

But this does not work (assuming the nested field is foo->bar):

"transforms": "ReplaceField",
"transforms.ReplaceField.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.ReplaceField.blacklist": "FieldFoo.NestedFieldBar"

My data is in Avro format.

I don't want to modify the data itself (for example, by flattening everything) to be able to do that. Is there any way?

OneCricketeer
Yannick

2 Answers

3

All (or at least most) of the Kafka Connect transforms only work on top-level fields, via .get and .put calls on a Struct or a Map<String, ?>.

You can inspect the source here:

https://github.com/apache/kafka/blob/2.3/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L150-L163
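To illustrate the top-level-only behavior, here is a minimal sketch in plain Java. It is not the actual ReplaceField code: `TopLevelExcludeDemo` and `excludeTopLevel` are hypothetical names, and the record is modeled as a schemaless `Map` rather than an Avro-backed `Struct`. It only shows why a dotted name never matches a nested field when keys are compared at the top level.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of Kafka Connect.
class TopLevelExcludeDemo {

    // Copies only the top-level entries whose key is not in `exclude`.
    // Nested maps are never inspected, only their parent key is compared.
    static Map<String, Object> excludeTopLevel(Map<String, Object> value, Set<String> exclude) {
        Map<String, Object> updated = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : value.entrySet()) {
            if (!exclude.contains(entry.getKey())) {
                updated.put(entry.getKey(), entry.getValue());
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("NestedFieldBar", "value");

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("FieldFoo", nested);
        record.put("Other", 1);

        // No top-level key is literally named "FieldFoo.NestedFieldBar",
        // so the record comes back unchanged.
        System.out.println(excludeTopLevel(record, Set.of("FieldFoo.NestedFieldBar")));

        // Excluding the parent key drops the whole nested object.
        System.out.println(excludeTopLevel(record, Set.of("FieldFoo")));
    }
}
```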

I would suggest searching for, or opening, a JIRA (and a KIP), because this is a long-outstanding issue, IMO.

The difficulty, though, is that "FieldFoo.NestedFieldBar" is a valid field name in its own right. Without extra syntax such as back-ticks, or a KSQL-like FieldFoo->NestedFieldBar notation, it is hard to tell the following two objects apart:

"FieldFoo.NestedFieldBar": "value" 

and

"FieldFoo" : { 
  "NestedFieldBar": "value"
}
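One way around that ambiguity in a custom transform would be to take the path as a list of segments rather than a single dotted string. The sketch below is only an illustration under that assumption: `NestedRemoveDemo` and `removePath` are hypothetical names, and it handles schemaless `Map` values only. Avro data reaches a transform as a `Struct` with a schema, so a real SMT would also have to build an updated schema, which is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of Kafka Connect.
class NestedRemoveDemo {

    // Returns a copy of `value` with the field at `path` removed.
    // Each path segment is matched literally, so a key that contains a dot
    // is never confused with a nested field. The input map is not mutated.
    static Map<String, Object> removePath(Map<String, Object> value, List<String> path) {
        Map<String, Object> copy = new LinkedHashMap<>(value);
        if (path.isEmpty()) {
            return copy;
        }
        String head = path.get(0);
        if (path.size() == 1) {
            copy.remove(head);
            return copy;
        }
        Object child = copy.get(head);
        if (child instanceof Map) {
            @SuppressWarnings("unchecked")
            Map<String, Object> childMap = (Map<String, Object>) child;
            copy.put(head, removePath(childMap, path.subList(1, path.size())));
        }
        // If the parent is missing or is not a map, there is nothing to remove.
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("NestedFieldBar", "nested");
        inner.put("Keep", 1);

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("FieldFoo.NestedFieldBar", "literal"); // key with a dot in its name
        record.put("FieldFoo", inner);                    // genuinely nested field

        // Two segments: removes only the nested field.
        System.out.println(removePath(record, List.of("FieldFoo", "NestedFieldBar")));
        // One segment: removes only the literal dotted key.
        System.out.println(removePath(record, List.of("FieldFoo.NestedFieldBar")));
    }
}
```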
OneCricketeer
0

You have to apply the "Flatten" transform first, so that nested field names are joined to their parent names with a "." (period) delimiter. In your case, the following should work:

"transforms": "flatten,ReplaceField",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": ".",
"transforms.ReplaceField.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.ReplaceField.blacklist": "FieldFoo.NestedFieldBar"
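As a rough picture of what this chain does to a record, the sketch below flattens a nested map with a "." delimiter and then drops the flattened key. It is not the Flatten source: `FlattenThenExcludeDemo` and `flatten` are hypothetical names, the record is a schemaless `Map`, and copying non-map values (including lists) unchanged is a choice of this sketch, not a statement about how the real transform treats arrays. Note that the output record is flat, so the overall shape of the data does change.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of Kafka Connect.
class FlattenThenExcludeDemo {

    // Recursively copies `source` into `target`, joining nested key names
    // to their parent names with `delimiter`.
    static void flatten(Map<String, Object> source, String prefix,
                        String delimiter, Map<String, Object> target) {
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            String name = prefix.isEmpty() ? entry.getKey() : prefix + delimiter + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                flatten(child, name, delimiter, target);
            } else {
                // Sketch-only choice: every non-map value is copied as-is.
                target.put(name, value);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("NestedFieldBar", "value");
        inner.put("Keep", 1);

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("FieldFoo", inner);

        // Step 1: flatten, so the nested field becomes a top-level key.
        Map<String, Object> flat = new LinkedHashMap<>();
        flatten(record, "", ".", flat);
        System.out.println(flat.keySet());

        // Step 2: a top-level exclude can now match the dotted name.
        flat.remove("FieldFoo.NestedFieldBar");
        System.out.println(flat.keySet());
    }
}
```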
Mahaveer Jangir
  • The above solution does not work when the data contains an ArrayList. Do you know how to flatten the ArrayList and blacklist it? – Vishal Dhanani Apr 06 '22 at 15:03