
I am new to Kafka and I want to see if I can sync MongoDB data with another system using Kafka.

My setup:

  1. I am running an AWS MSK cluster and have manually set up an EC2 instance with the Kafka client.
  2. I have added the MongoDB Kafka Connect plugin to /usr/local/share/kafka/plugins.
  3. I am running Kafka Connect and can see that it loads the plugin:
./bin/connect-standalone.sh ./config/connect-standalone.properties /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/etc/MongoSourceConnector.properties
[2020-10-17 13:57:22,304] INFO Registered loader: PluginClassLoader{pluginLocation=file:/usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/} (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:264)
[2020-10-17 13:57:22,305] INFO Added plugin 'com.mongodb.kafka.connect.MongoSourceConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:193)
[2020-10-17 13:57:22,305] INFO Added plugin 'com.mongodb.kafka.connect.MongoSinkConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:193)
  4. The unpacked plugin has this structure:
Archive:  mongodb-kafka-connect-mongodb-1.3.0.zip
   creating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/
   creating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/etc/
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/etc/MongoSourceConnector.properties  
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/etc/MongoSinkConnector.properties  
   creating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/doc/
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/doc/README.md  
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/doc/LICENSE.txt  
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/manifest.json  
   creating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/lib/
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/lib/mongo-kafka-1.3.0-all.jar  
   creating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/assets/
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/assets/mongodb-leaf.png  
  inflating: /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/assets/mongodb-logo.png  

This plugin is from the Confluent page; I have also tried downloading it from the Maven page. The problem is that when I run Kafka Connect, it fails because the plugin is missing a Java dependency.

[2020-10-17 13:57:24,898] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:121)
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: org/apache/avro/Schema
    at com.mongodb.kafka.connect.source.MongoSourceConfig.createConfigDef(MongoSourceConfig.java:591)
    at com.mongodb.kafka.connect.source.MongoSourceConfig.<clinit>(MongoSourceConfig.java:293)
    at com.mongodb.kafka.connect.MongoSourceConnector.config(MongoSourceConnector.java:91)
    at org.apache.kafka.connect.connector.Connector.validate(Connector.java:129)
    at com.mongodb.kafka.connect.MongoSourceConnector.validate(MongoSourceConnector.java:51)
    at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:313)
    at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:192)
    at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:115)
Caused by: java.lang.NoClassDefFoundError: org/apache/avro/Schema
    ... 8 more
Caused by: java.lang.ClassNotFoundException: org.apache.avro.Schema
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 8 more

My impression was that the plugin should look for its dependencies in the JAR file /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/lib/mongo-kafka-1.3.0-all.jar, not in the Java SDK.

What am I missing in this setup?

Semant1ka
  • I just had the same problem and was surprised to find such a recent question here. It worked for me after running `bin/connect-distributed.sh config/connect-distributed.properties`. The `plugin.path` has to be changed in `connect-distributed.properties` as well, obviously. I don't know if this is a good solution, though. – Simon K. Oct 17 '20 at 18:04
  • Thank you so much for your reply @SimonK. This works indeed! Running Kafka Connect in distributed mode is what is recommended for production, so this is probably not a huge issue. Do you think it is worth filing a bug with the Mongo Kafka plugin developers? It looks like one to me. – Semant1ka Oct 17 '20 at 20:22
  • @SimonK. Apparently, even though Kafka Connect seems to start in distributed mode and load the connector, the Kafka Connect REST API returns an empty list when I try to list connectors. When I try to create one, it results in the same error. In your setup, are you able to create a connector? – Semant1ka Oct 18 '20 at 19:02
  • I experienced a similar problem. It was quite frustrating, so I switched to the previous version, MongoDB Connect 1.2.0. Distributed mode does not work in this constellation either, though. I am a total newbie to this kind of technology, so I cannot say for sure whether I am doing something wrong or this really is a bug. – Simon K. Oct 19 '20 at 06:04
  • @SimonK. Oh, well. I have created a ticket with Mongo Support. I'll see what they respond and post it here. – Semant1ka Oct 19 '20 at 20:38
  • @SimonK. See answer – OneCricketeer Oct 20 '20 at 07:45

2 Answers


Listing the contents of the JAR should quickly tell you whether the error is accurate:

jar -tf mongo-kafka-1.3.0-all.jar | grep avro

If that JAR doesn't bundle Avro itself, the error is accurate: MSK very likely doesn't include Avro the way Confluent Platform does (which I assume is what Mongo primarily packaged their connector for). Avro is not a dependency of Apache Kafka, so that would explain the error.
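To check for the exact class named in the stack trace rather than any path containing "avro", the listing can be matched against the class's entry name. This is only a minimal sketch: `jar_listing_has_class` is a hypothetical helper (not part of the JDK or Kafka), and it assumes the class is not relocated (shaded) under a different package inside the JAR.

```shell
# Hypothetical helper: reads a JAR listing (one entry per line, as printed
# by `jar -tf`) on stdin and succeeds only if the given fully qualified
# class name is present as an entry.
jar_listing_has_class() {
  # org.apache.avro.Schema -> org/apache/avro/Schema.class
  entry="$(printf '%s' "$1" | tr . /).class"
  # -x: match the whole line, -F: fixed string, -q: no output, exit status only
  grep -qxF -- "$entry"
}
```

Usage would look like `jar -tf mongo-kafka-1.3.0-all.jar | jar_listing_has_class org.apache.avro.Schema && echo bundled || echo missing`.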

You will need to download the Avro JAR and place it on your Kafka Connect classpath (or at least in the plugin's lib folder).
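As a rough sketch of that step, the Avro JAR can be fetched from Maven Central using the standard repository layout. The helper names `maven_jar_url` and `install_avro` are hypothetical, and the default version 1.9.2 is an assumption taken from the connector's build file as discussed in the comments; adjust it to whatever your connector version was built against.

```shell
# Hypothetical helper: builds the Maven Central URL of a plain (no classifier)
# JAR from its group id, artifact id, and version.
maven_jar_url() {
  # org.apache.avro -> org/apache/avro
  group_path="$(printf '%s' "$1" | tr . /)"
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.jar\n' \
    "$group_path" "$2" "$3" "$2" "$3"
}

# Hypothetical helper: downloads the Avro JAR into the given directory.
# The default version (1.9.2) is an assumption; pass another as $2 if needed.
install_avro() {
  plugin_lib="$1"
  version="${2:-1.9.2}"
  wget -P "$plugin_lib" "$(maven_jar_url org.apache.avro avro "$version")"
}
```

For the setup in the question this would be called as `install_avro /usr/local/share/kafka/plugins/mongodb-kafka-connect-mongodb-1.3.0/lib`, followed by a restart of Kafka Connect. If further `NoClassDefFoundError`s appear afterwards, Avro's own dependencies may have to be added the same way.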

OneCricketeer
  • Thank you for this analysis. As far as I read the documentation, one does not have to add custom JARs to get the plugin running in a non-Confluent environment. It would be interesting to know whether the Avro JAR was present in former releases. – Simon K. Oct 20 '20 at 12:52
  • It is mentioned in the Gradle file, but that doesn't necessarily mean that they package it in the build output https://github.com/mongodb/mongo-kafka/blob/master/build.gradle.kts#L78. See, they exclude everything that is not bson or Mongo https://github.com/mongodb/mongo-kafka/blob/master/build.gradle.kts#L239 – OneCricketeer Oct 20 '20 at 13:37
  • Thank you for this comment; somehow it didn't occur to me to unpack the JAR and actually search for the library. Yes, it looks like this library is not bundled in this JAR, but as @SimonK. noted, the Mongo how-to page makes it look like it should be. Also, Simon runs an earlier version of the plugin that works, so maybe this is actually a bug in how they bundle it. AWS MSK runs bare-bones Apache Kafka, so this dependency definitely doesn't exist there. – Semant1ka Oct 20 '20 at 18:17
  • Alright, I have downloaded the Avro library and its dependencies with `wget https://download.jar-download.com/cache_jars/org.apache.avro/avro/1.10.0/jar_files.zip` and got the standalone version up and running (actually receiving Mongo events from the change stream). Depending on what Mongo support tells me, I will either mark this as the answer or wait until they create a proper bundle/documentation and post that here as an answer. – Semant1ka Oct 20 '20 at 18:38
  • Worth pointing out that version 1.9.2 is used by the connector (based on the link above). It is also better to get it from official sources https://search.maven.org/classic/#search%7Cgav%7C1%7Cg%3A%22org.apache.avro%22%20AND%20a%3A%22avro%22 – OneCricketeer Oct 20 '20 at 18:46

I faced the same issue when running locally. I had downloaded the JAR (mongo-kafka-connect-1.6.0-confluent.jar) from the Confluent platform, which no longer provides an uber JAR. I searched for the uber JAR and found it on the site below, where I could download it (select "all" in the Download dropdown), and that resolved the issue.

https://search.maven.org/search?q=a:mongo-kafka-connect

Hamid