According to the tutorial I'm following, you just run `script.sh foo.properties`, but that isn't working for me.

connect-standalone.properties file contents:

vagrant@coton:~/kafka/kafka_2.13-2.8.0$ grep -v '^#'  config/connect-standalone.properties 

bootstrap.servers=localhost:9092

key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true

offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000

plugin.path=/home/vagrant/kafka/kafka_2.13-2.8.0/connectors

output when executing command:

vagrant@coton:~/kafka/kafka_2.13-2.8.0$ ./bin/connect-standalone.sh config/connect-standalone.properties 
[2021-07-24 21:33:40,383] INFO Usage: ConnectStandalone worker.properties connector1.properties [connector2.properties ...] (org.apache.kafka.connect.cli.ConnectStandalone:62)
vagrant@coton:~/kafka/kafka_2.13-2.8.0$ 

This INFO message is output by the program ConnectStandalone itself. It appears to be asking for another file, but in the example I am following there is no other file. What am I doing wrong? I am running on Java 8.

The video I'm following is https://www.youtube.com/watch?v=18gDPSOH3wU and at 7:53 he starts the distributed connect command with only one properties file (effectively connect-distributed.sh foo.properties). There's no "worker properties" file that I've seen mentioned elsewhere. And for him it works.

JL_SO
  • I believe the contents of your `connect-standalone.properties` file make it a worker configuration file. If you look at the documentation here https://docs.confluent.io/3.2.2/connect/quickstart.html, you'll see your config file is almost identical to the sample worker configuration file. What exactly is your question though? It is just an INFO message, not an ERROR or WARNING. – purple Jul 24 '21 at 21:00
  • @purple That's a fair point...but although it's INFO it's indicative of a usage error, and indeed if you look at the source (which I found since posting this question) it indicates the program prints the message and then exits - https://fossies.org/linux/kafka/connect/runtime/src/main/java/org/apache/kafka/connect/cli/ConnectStandalone.java. If you watch the video, you will see that "the program starts up" (or at least, a load of stuff happens). I think something is wrong - I'm not doing it right. – JL_SO Jul 24 '21 at 21:28
  • Hmm very interesting, good find. If you look at the source for the code the guy is using in the tutorial, you will see why this is happening. `ConnectDistributed.java` checks for `args.length < 1` whereas `ConnectStandalone.java` checks for `args.length < 2`. So that is why he is able to only use one properties file. I assume you will have to also pass a connector properties file in order to get your solution to work. – purple Jul 24 '21 at 21:37
  • @purple Yep, I just tried pasting my curl payload (the one I had been using for Debezium on Docker; the whole point of what I'm doing here is to ditch Docker, as it's a PITA that complicates things, but all the tutorials use the damned thing). That didn't work, but it definitely went further. I now have something that I think is progress, and it has fixed this particular issue at least. Many thanks. – JL_SO Jul 24 '21 at 21:47

2 Answers

The answer at https://stackoverflow.com/a/53671106/4026629 usefully points out that connect-distributed needs only one properties file, whereas connect-standalone needs two: the worker configuration plus at least one connector configuration. Unfortunately, none of the tutorials I have seen makes this clear.
Furthermore, the connector file must use the standard properties format (key=value), not the JSON payload that you POST with curl when running Debezium via Docker, as virtually every tutorial seems to do.

I managed to get further with this connector properties file:

connector.class=io.debezium.connector.postgresql.PostgresConnector
database.user=postgres
database.dbname=sport
transforms=unwrap
database.server.name=foo
database.port=5432
plugin.name=pgoutput
table.whitelist=public.event
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
decimal.handling.mode=string
database.hostname=1.2.3.4
database.password=
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
name=pg-sport-connector
transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
value.converter=org.apache.kafka.connect.json.JsonConverter
database.whitelist=sport
key.converter=org.apache.kafka.connect.json.JsonConverter

Note that, in my experience, the right value for database.hostname depends on your setup: it might be localhost or something else, for example depending on whether the database or Connect is running in Docker.
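
To illustrate the JSON-versus-properties point, here is a rough sketch of how a curl-style payload could be flattened into key=value lines. It assumes a pretty-printed payload with exactly one "key": "value" pair per line and string values only; it is not a real JSON parser. The function name json_to_properties, the sample payload and the file names are all made up for this example.

```shell
#!/bin/sh
# Sketch only: flatten a pretty-printed connector JSON payload into
# key=value lines. Assumes one "key": "value" pair per line with string
# values; nested values, unquoted numbers and escaped quotes are NOT handled.
json_to_properties() {
    sed -n 's/^[[:space:]]*"\([^"]*\)"[[:space:]]*:[[:space:]]*"\([^"]*\)"[[:space:]]*,\{0,1\}[[:space:]]*$/\1=\2/p' "$1"
}

# Hypothetical payload in the shape normally POSTed to the Connect REST API.
workdir=$(mktemp -d)
cat > "$workdir/pg-sport-connector.json" <<'EOF'
{
  "name": "pg-sport-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "localhost",
    "database.port": "5432"
  }
}
EOF

# The "config": { line has no quoted value, so it is skipped; "name" and
# the settings inside "config" each become one key=value line.
json_to_properties "$workdir/pg-sport-connector.json" \
    > "$workdir/pg-sport-connector.properties"
cat "$workdir/pg-sport-connector.properties"
```

The resulting file can then be passed as the second argument to connect-standalone.sh. For anything beyond a simple flat payload, converting by hand or with a proper JSON tool is safer.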

JL_SO

[2021-07-24 21:33:40,383] INFO Usage: ConnectStandalone worker.properties connector1.properties [connector2.properties ...] (org.apache.kafka.connect.cli.ConnectStandalone:62)

This means that you also need a connector1.properties file, that is, a file that contains the information about the database you want to connect to. Create such a file, place it in the config folder, and pass it on the command line like this:

./bin/connect-standalone.sh ./config/connect-standalone.properties ./config/this_file.properties 

More details on this topic are in the question "Is there a equivalent Debezium command to starting Kafka Connect without Docker container".
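
If you want to catch the mistake before Connect prints the usage line and exits, a small wrapper along these lines might help. It is only a sketch: the function name check_standalone_args is made up, and it simply mirrors the "at least two arguments" rule from the usage message, plus a readability check on each file.

```shell
#!/bin/sh
# Sketch: verify that a worker file and at least one connector file were
# given, and that each file is readable, before starting standalone Connect.
# check_standalone_args is a hypothetical helper, not part of Kafka.
check_standalone_args() {
    if [ "$#" -lt 2 ]; then
        echo "need worker.properties plus at least one connector.properties" >&2
        return 1
    fi
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done
    return 0
}

# Example use, with the paths from the command above:
# check_standalone_args ./config/connect-standalone.properties ./config/this_file.properties \
#     && ./bin/connect-standalone.sh ./config/connect-standalone.properties ./config/this_file.properties
```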

broch