3

Currently I read an .avsc file to get the schema, like this:

value_schema = avro.load('client.avsc')

Is there a way to get the schema from the Confluent Schema Registry using the topic name?

I found one approach, but couldn't figure out how to use it:

https://github.com/marcosschroh/python-schema-registry-client

Giorgos Myrianthous
Mohit Singh

3 Answers

17

Using confluent-kafka-python

from confluent_kafka.avro.cached_schema_registry_client import CachedSchemaRegistryClient

sr = CachedSchemaRegistryClient({
    'url': 'http://localhost:8081',
    'ssl.certificate.location': '/path/to/cert',  # optional
    'ssl.key.location': '/path/to/key'  # optional
})

# get_latest_schema returns a (schema_id, schema, version) tuple; index 1 is the schema
value_schema = sr.get_latest_schema("orders-value")[1]
key_schema = sr.get_latest_schema("orders-key")[1]
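Since the question asks about looking up a schema by topic name, it may help to spell out how a topic maps to a subject. The following is a minimal sketch, assuming the registry uses the default TopicNameStrategy (subjects named `<topic>-value` and `<topic>-key`). The helper name `subject_for_topic` is hypothetical and not part of any library.

```python
def subject_for_topic(topic: str, is_key: bool = False) -> str:
    """Return the subject name for a topic, assuming the default TopicNameStrategy."""
    suffix = "key" if is_key else "value"
    return f"{topic}-{suffix}"
```

With the client above, this could be used as `sr.get_latest_schema(subject_for_topic("orders"))[1]`. If your producers are configured with a different subject name strategy, the mapping will differ.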

Using SchemaRegistryClient

Getting schema by subject name

from schema_registry.client import SchemaRegistryClient


sr = SchemaRegistryClient('http://localhost:8081')
my_schema = sr.get_schema(subject='mySubject', version='latest')

Getting schema by ID

from schema_registry.client import SchemaRegistryClient


sr = SchemaRegistryClient('http://localhost:8081')
my_schema = sr.get_by_id(schema_id=1)
Giorgos Myrianthous
  • 1
    Precisely. Just a note to add that typically the subject for a topic will be <topic>-key or <topic>-value, depending on which part of the message you are reading. Also, the Confluent serde (as I understand it) encodes the actual schema ID into the serialised message somehow, and you access the schema by that rather than by subject and version. – w08r Feb 29 '20 at 23:07
  • 2
    @wobr Actually you can omit `version`. In that case you will get the latest subject schema. – Giorgos Myrianthous Feb 29 '20 at 23:10
  • 1
    Latest is not always what you want. – w08r Mar 01 '20 at 08:52
  • 1
    @wobr the confluent_kafka Python library handles the "somehow" of encoding the message with the ID – OneCricketeer Mar 01 '20 at 15:03
  • @wobr - So you mean that for each record, you get the schema from the schema registry? Won't that slow down performance? – Rohi_Dev_1.0 Feb 16 '21 at 06:58
  • I imagine it's cached – w08r Feb 20 '21 at 11:25
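The comments above mention that the schema ID is encoded into each message and that lookups are probably cached. As a rough sketch of both ideas: in the Confluent wire format, a serialized message starts with a magic byte of 0 followed by a 4-byte big-endian schema ID. The helper names `extract_schema_id` and `cached_lookup` are hypothetical, and the `fetch` callable stands in for whatever registry call you use (for example a client's lookup-by-ID method).

```python
import struct
from functools import lru_cache

MAGIC_BYTE = 0  # first byte of a Confluent-framed message


def extract_schema_id(message: bytes) -> int:
    """Read the schema ID from the 5-byte Confluent wire-format header."""
    if len(message) < 5:
        raise ValueError("message too short to contain a schema ID header")
    # ">bI" = big-endian signed byte (magic) followed by unsigned 32-bit int (schema ID)
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id


def cached_lookup(fetch):
    """Wrap a fetch-by-ID callable so each schema ID is fetched only once."""
    return lru_cache(maxsize=None)(fetch)
```

Because a schema ID always refers to the same schema, caching by ID means the registry is contacted once per distinct schema rather than once per record.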
4

You can use the get_latest_version function to get schema information:

from confluent_kafka.schema_registry import SchemaRegistryClient

sr = SchemaRegistryClient({"url": 'http://localhost:8081'})
subjects = sr.get_subjects()
for subject in subjects:
    schema = sr.get_latest_version(subject)
    print(schema.version)
    print(schema.schema_id)
    print(schema.schema.schema_str)
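The `schema_str` printed above is the schema as a JSON string, so it can be parsed with the standard library. A small sketch, assuming an Avro record schema; the helper name `field_names` and the sample schema in the usage note are illustrative only:

```python
import json


def field_names(schema_str: str) -> list:
    """Return the top-level field names of an Avro record schema given as JSON."""
    schema = json.loads(schema_str)
    # Non-record schemas (e.g. a bare "string") have no fields to list
    if not isinstance(schema, dict):
        return []
    return [field["name"] for field in schema.get("fields", [])]
```

For example, `field_names(schema.schema.schema_str)` inside the loop would list the fields of each subject's latest schema.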
alirezaSafi
0

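I did it like this, and it worked for me:

     import os

     import requests

     # Subject to look up (hypothetical name); subjects are typically
     # '<topic>-value' or '<topic>-key'
     topic = 'orders-value'

     SCHEMA_REGISTRY_URL = os.getenv('SCHEMA_REGISTRY_URL')
     print("SCHEMA_REGISTRY_URL: ", SCHEMA_REGISTRY_URL)
     URL = SCHEMA_REGISTRY_URL + '/subjects/' + topic + '/versions/latest/schema'
     r = requests.get(url=URL)
     r.raise_for_status()  # fail early on a missing subject or bad URL
     schema = r.json()

     print("Schema From Schema Registry ==========================>>")
     print("Schema: ", schema)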
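The string concatenation above can break if the base URL ends with a slash or the subject contains characters that need escaping. A hedged sketch of a more defensive URL builder for the same REST endpoint; the function name `latest_schema_url` is hypothetical:

```python
from urllib.parse import quote


def latest_schema_url(base_url: str, subject: str) -> str:
    """Build the URL for the latest schema of a subject on a Schema Registry."""
    # Strip any trailing slash and percent-encode the subject as a single path segment
    return f"{base_url.rstrip('/')}/subjects/{quote(subject, safe='')}/versions/latest/schema"
```

The result can be passed straight to `requests.get`.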


Mohit Singh