3

I'm using the Mongo source connector to listen to a MongoDB change stream and put all events into Kafka, but I'm struggling to find a way to extract the "real" key from the event. I tried a transformation, but it didn't work and gave me this error:

Caused by: org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String

In the Mongo source connector code I found this line

which basically implies that the connector has no configurable key handling; it always uses the "_id" field of the event (which is not the id of the document but the resume token).

What I want is for the record key to be "documentKey".

Here is an example of the events the connector gets:

{
  "_id": {
    "_data": "DSAD45543FFWEHTEY004....."
  },
  "operationType": "replace",
  "clusterTime": {
    "$timestamp": {
      "t": 1446707990,
      "i": 1
    }
  },
  "fullDocument": {
    "_id": {
      "$binary": "FxVFgHFRhrr/z+zUc/w==",
      "$type": "03"
    },
    ...
  },
  "ns": {
    "db": "somedb",
    "coll": "somecol"
  },
  "documentKey": {
    "_id": {
      "$binary": "FxVFgHFRhrr/z+zUc/w==",
      "$type": "03"
    }
  }
}

I used the following configuration:

"transforms":"createKey",
"transforms.createKey.type":"org.apache.kafka.connect.transforms.ValueToKey",
"transforms.createKey.fields":"documentKey"

I tried it with:

org.apache.kafka.connect.json.JsonConverter

and also with StringConverter (although I don't think this can be done with a string):

org.apache.kafka.connect.storage.StringConverter

Is there any way to extract the key? Please note: schema is disabled.

OneCricketeer
Winter Ha

2 Answers

1

This is because MongoDB Source Connector for Kafka does not support it yet. It should support advanced Key selection from release 1.3 onward.

https://jira.mongodb.org/browse/KAFKA-40

Hamid
0

"Please note: schema is disabled"

In that case, you cannot use the ValueToKey transform. Even if you could, though, that transform does not support nested values within the payload, which in your case would be something like documentKey._id.$binary
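
To make the nesting concrete, here is a minimal Python sketch of the lookup a key extraction would have to perform on the event shown in the question. It is not connector or SMT code; the function name `extract_key` and the default path are hypothetical, and it assumes the event arrives as a JSON string, as the error message in the question suggests.

```python
import json


def extract_key(event_json, path=("documentKey", "_id", "$binary")):
    """Parse a change-event JSON string and walk a nested path.

    `path` is an assumed default based on the sample event in the question.
    Raises KeyError if any segment of the path is missing.
    """
    node = json.loads(event_json)
    for part in path:
        if not isinstance(node, dict) or part not in node:
            raise KeyError("missing segment %r in path %s" % (part, ".".join(path)))
        node = node[part]
    return node


# Trimmed-down version of the sample event from the question.
event = json.dumps({
    "_id": {"_data": "DSAD45543FFWEHTEY004....."},
    "operationType": "replace",
    "ns": {"db": "somedb", "coll": "somecol"},
    "documentKey": {
        "_id": {"$binary": "FxVFgHFRhrr/z+zUc/w==", "$type": "03"}
    },
})

key = extract_key(event)
```

The point of the sketch is that the value has to be parsed first and then traversed three levels deep, whereas ValueToKey only copies top-level fields from the value.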

OneCricketeer