
I load the S3 connector with the following parameters:

confluent load s3-sink
{
  "name": "s3-sink",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "1",
    "topics": "s3_topic",
    "s3.region": "us-east-1",
    "s3.bucket.name": "some_bucket",
    "s3.part.size": "5242880",
    "flush.size": "1",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
    "schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
    "partitioner.class": "io.confluent.connect.storage.partitioner.FieldPartitioner",
    "schema.compatibility": "NONE",
    "partition.field.name": "f1",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "false",
    "value.converter.schemas.enable": "false",
    "name": "s3-sink"
  },
  "tasks": [
    {
      "connector": "s3-sink",
      "task": 0
    }
  ],
  "type": null
}

Next I send the following JSON with kafka-console-producer:

{"f1":"partition","data":"some data"}

And I get the following error in the connect log:

[2018-05-16 16:32:05,150] ERROR Value is not Struct type. (io.confluent.connect.storage.partitioner.FieldPartitioner:67)
[2018-05-16 16:32:05,150] ERROR WorkerSinkTask{id=s3-sink-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. (org.apache.kafka.connect.runtime.WorkerSinkTask:515)
io.confluent.connect.storage.errors.PartitionException: Error encoding partition.

I remember this worked some time ago.
I am now using Confluent Open Source 4.1.

Vova l

1 Answer


As of the Confluent 4.1 release, FieldPartitioner does not support extracting fields from schemaless JSON; it requires the record value to be a Connect Struct with a schema, which is why you see the "Value is not Struct type" error.

You could instead use kafka-avro-console-producer to send the same JSON blob with an Avro schema; then it should work.

Here is the property you want to use:

--property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"},{"name":"data","type":"string"}]}'

Then you can send

{"f1":"partition","data":"some data"}

And you'll need to use these properties in Connect:

"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schemas.enable": "true",
OneCricketeer
  • Thanks! How can I do this with Java Kafka Driver? – Vova l May 17 '18 at 05:50
  • Sending Avro? Checkout the examples here. https://github.com/confluentinc/examples/tree/4.0.x/kafka-clients – OneCricketeer May 17 '18 at 13:59
  • Hey @cricket_007, do you have an example for partition.field.name (in .properties format), using several partitions, based on the field name (example: field1=value/field2=value/field3=value)? Thanks – Julia Bel Feb 19 '20 at 14:39
  • @Jul `partition.field.name=field1,field2,field3`. Again, though, please use Connect Distributed mode – OneCricketeer Feb 19 '20 at 14:48
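
A minimal .properties sketch of what the comment above describes (field1/field2/field3 are just the placeholder names from that comment):

partitioner.class=io.confluent.connect.storage.partitioner.FieldPartitioner
partition.field.name=field1,field2,field3

With that, objects are written under key prefixes like field1=<value>/field2=<value>/field3=<value>/.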