
I load my data from Kafka into Oracle with the Confluent JDBC sink connector.

At the moment I send the schema together with the data in every message value.

I do not want to send the schema with the data every time. How can I register the schema for the Kafka topic once, and then send just the data from my client?

thanks in advance

JSON data:

{
    "schema": {
        "type": "struct",
        "fields": [
            {
                "field": "ID",
                "type": "int32",
                "optional": false
            },
            {
                "field": "PRODUCT",
                "type": "string",
                "optional": true
            },
            {
                "field": "QUANTITY",
                "type": "int32",
                "optional": true
            },
            {
                "field": "PRICE",
                "type": "int32",
                "optional": true
            }
        ],
        "optional": true,
        "name": "myrecord"
    },
    "payload": {
        "ID": 1071,
        "PRODUCT": "ersin",
        "QUANTITY": 1071,
        "PRICE": 1453
    }
}

Python code:

# v is the dict shown above (schema + payload), serialized as JSON
producer.send(topic, key=b'1071',
              value=json.dumps(v, default=json_util.default).encode('utf-8'))

how can I solve this?

thanks in advance

  • You need to publish the message with a schema from the producer, and you also need to enable a schema converter for the sink in Kafka Connect. – Atif Apr 15 '20 at 13:14
  • I do not want to send the schema with the data every time; it works when I send the schema and payload together. – CompEng Apr 15 '20 at 13:21
  • It is not possible to sink data using Kafka Connect without a schema. Without a schema, Kafka Connect will fail to map fields to the database table columns. And you do not need to send the schema every time, because Schema Registry stores schemas in the _schemas topic in the broker and binds each schema to a specific topic. – Atif Apr 15 '20 at 13:27
  • Thanks for the answer. So how can I use Schema Registry for one specific topic? My schema is JSON. – CompEng Apr 15 '20 at 13:30
  • When you publish your object, you need to set io.confluent.kafka.serializers.KafkaJsonSchemaSerializer.class as the serializer in the producer, and you need to enable the JSON Schema converter in the Kafka sink connector property file. – Atif Apr 15 '20 at 13:37
  • My producer is Python, so how do I set up Schema Registry for my JSON just one time? – CompEng Apr 15 '20 at 13:42
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/211720/discussion-between-ersin-gulbahar-and-atif). – CompEng Apr 15 '20 at 13:42

1 Answer


If you want to use the JDBC sink connector, you must provide a schema. This can be achieved in one of three ways:

  • Use JSON with schemas enabled
  • Use Avro and Schema Registry
  • Use JSON schemas with Schema Registry

You are currently using JSON with schemas enabled, which requires you to send the schema along with the actual payload in every message. One way to achieve your requirement is to use Avro together with Confluent Schema Registry, so that your schemas are registered in the Schema Registry and you are no longer required to send the payload schema every time.
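
For example, with the confluent-kafka Python client you could use an Avro schema like the one below. This is only a minimal sketch, assuming Schema Registry at http://localhost:8081, a broker at localhost:9092, and a topic called orders (none of these come from the original post; adjust them to your environment):

from confluent_kafka import SerializingProducer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import StringSerializer

# Avro equivalent of the struct schema from the question
schema_str = """
{
  "type": "record",
  "name": "myrecord",
  "fields": [
    {"name": "ID",       "type": "int"},
    {"name": "PRODUCT",  "type": ["null", "string"], "default": null},
    {"name": "QUANTITY", "type": ["null", "int"],    "default": null},
    {"name": "PRICE",    "type": ["null", "int"],    "default": null}
  ]
}
"""

# Assumed local addresses -- change to match your cluster
schema_registry_client = SchemaRegistryClient({'url': 'http://localhost:8081'})
avro_serializer = AvroSerializer(schema_registry_client, schema_str)

producer = SerializingProducer({
    'bootstrap.servers': 'localhost:9092',
    'key.serializer': StringSerializer('utf_8'),
    'value.serializer': avro_serializer
})

# Only the data is sent; the schema is registered in Schema Registry once
producer.produce(topic='orders',
                 key='1071',
                 value={'ID': 1071, 'PRODUCT': 'ersin', 'QUANTITY': 1071, 'PRICE': 1453})
producer.flush()

On the Connect side the JDBC sink would then use io.confluent.connect.avro.AvroConverter as the value.converter, with value.converter.schema.registry.url pointing at the same Schema Registry.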

Another option would be to use JSON Schema with Schema Registry (#1289). For Kafka Connect you can use JsonSchemaConverter, and for Java consumers and producers you can use KafkaJsonSchemaSerializer and KafkaJsonSchemaDeserializer.
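
From a Python producer, the equivalent of those Java serializers is the JSONSerializer in the confluent-kafka client. Again a minimal sketch under the same assumed addresses and topic name as above:

from confluent_kafka import SerializingProducer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.json_schema import JSONSerializer
from confluent_kafka.serialization import StringSerializer

# JSON Schema equivalent of the struct schema from the question
schema_str = """
{
  "title": "myrecord",
  "type": "object",
  "properties": {
    "ID":       {"type": "integer"},
    "PRODUCT":  {"type": "string"},
    "QUANTITY": {"type": "integer"},
    "PRICE":    {"type": "integer"}
  },
  "required": ["ID"]
}
"""

schema_registry_client = SchemaRegistryClient({'url': 'http://localhost:8081'})
json_serializer = JSONSerializer(schema_str, schema_registry_client)

producer = SerializingProducer({
    'bootstrap.servers': 'localhost:9092',
    'key.serializer': StringSerializer('utf_8'),
    'value.serializer': json_serializer
})

# The message value carries only the payload; the schema lives in Schema Registry
producer.produce(topic='orders',
                 key='1071',
                 value={'ID': 1071, 'PRODUCT': 'ersin', 'QUANTITY': 1071, 'PRICE': 1453})
producer.flush()

The sink connector would then set value.converter to io.confluent.connect.json.JsonSchemaConverter (plus value.converter.schema.registry.url).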

Giorgos Myrianthous