
I'm using the Kafka Connect HDFS sink connector.

I want to write Parquet files from JSON messages in a Kafka topic.

I want to create JSON with "schema" and "payload" fields, as follows (taken from another SO question):

{
"schema": {
    "type": "struct",
    "fields": [{
        "type": "int32",
        "optional": true,
        "field": "c1"
    }, {
        "type": "string",
        "optional": true,
        "field": "c2"
    }, {
        "type": "int64",
        "optional": false,
        "name": "org.apache.kafka.connect.data.Timestamp",
        "version": 1,
        "field": "create_ts"
    }, {
        "type": "int64",
        "optional": false,
        "name": "org.apache.kafka.connect.data.Timestamp",
        "version": 1,
        "field": "update_ts"
    }],
    "optional": false,
    "name": "foobar"
},
"payload": {
    "c1": 10000,
    "c2": "bar",
    "create_ts": 1501834166000,
    "update_ts": 1501834166000
}
}

Is there an automatic tool that creates such a schema from JSON, using Kafka Connect types?
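To make the request concrete, here is a minimal sketch of what such a tool might do, written in Python. It is not part of Kafka Connect or any Confluent tooling; the function names `infer_type` and `wrap_with_schema` are hypothetical. It assumes every field is optional and picks `int32` or `int64` by value range. Logical types such as `org.apache.kafka.connect.data.Timestamp`, and which fields are really required, cannot be inferred from a single sample and would have to be added by hand.

```python
import json

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def infer_type(value):
    """Return a Connect-style JSON schema dict for one sample value (hypothetical helper)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int in Python
        return {"type": "boolean"}
    if isinstance(value, int):
        # Assumption: use int32 when the sample fits, otherwise int64.
        return {"type": "int32" if INT32_MIN <= value <= INT32_MAX else "int64"}
    if isinstance(value, float):
        return {"type": "double"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, dict):
        fields = []
        for key, item in value.items():
            field = infer_type(item)
            field["optional"] = True  # assumption: a sample cannot prove a field is required
            field["field"] = key
            fields.append(field)
        return {"type": "struct", "fields": fields}
    if isinstance(value, list) and value:
        items = infer_type(value[0])  # assumption: all elements share the first element's type
        items["optional"] = True
        return {"type": "array", "items": items}
    raise ValueError(f"cannot infer a type from {value!r}")


def wrap_with_schema(payload, name):
    """Build the {"schema": ..., "payload": ...} envelope for a sample record."""
    schema = infer_type(payload)
    schema["optional"] = False
    schema["name"] = name
    return {"schema": schema, "payload": payload}


if __name__ == "__main__":
    sample = {
        "c1": 10000,
        "c2": "bar",
        "create_ts": 1501834166000,
        "update_ts": 1501834166000,
    }
    print(json.dumps(wrap_with_schema(sample, "foobar"), indent=2))
```

For the sample payload above, this would type `c1` as `int32`, `c2` as `string`, and both timestamps as plain `int64` without the `Timestamp` logical name.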

My properties look like this:

connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
flush.size=3
format.class=io.confluent.connect.hdfs.parquet.ParquetFormat
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
schema.compatability=BACKWARD
key.converter.schemas.enabled=false
value.converter.schemas.enabled=false
schemas.enable=false

What should I change or add once I've created the schema?

Thanks

Ya Ko
  • Though this question is off-topic for Stack Overflow, I recently came across [a website that can generate JSON schema from JSON input](https://app.quicktype.io/#l=cs&r=json2csharp) (among other things). It's not a perfect solution, but it can give you a pretty good schema to start working with. – Zohar Peled Aug 15 '18 at 08:38
  • Did you try `value.schemas.enabled=true`? – OneCricketeer Aug 17 '18 at 14:25

0 Answers