2

I have a topic over which i am sending json in the following format:

 {
  "schema": {
   "type": "string",
   "optional": true
  },
  "payload": “CustomerData{version='1', customerId=‘76813432’,      phone=‘76813432’}”
} 

and I would like to create a stream with customerId, and phone but I am not sure how to define the stream in terms of a nested json object. (edited)

CREATE  STREAM customer (
    payload.version VARCHAR,
    payload.customerId VARCHAR,
    payload.phone VARCHAR
  ) WITH (
    KAFKA_TOPIC='customers',
    VALUE_FORMAT='JSON'
  );    

Would it be something like that? How do I de-reference the nested object in defining the streams field?

Actually the above does not work for field definitions it says:

Caused by: line 2:12: 
extraneous input '.' expecting {'EMIT', 'CHANGES',
'INTEGER', 'DATE', 'TIME', 'TIMESTAMP', 'INTERVAL', 'YEAR', 'MONTH', 'DAY',
Roger Alkins
  • 125
  • 11

1 Answers1

3

Applying function extractjsonfield

There is an ksqlDB function called extractjsonfield that you can use.

First, you need to extract the schema and payload fields:

CREATE STREAM customer (
  schema VARCHAR,
  payload VARCHAR
) WITH (
    KAFKA_TOPIC='customers',
    VALUE_FORMAT='JSON'
); 

Then you can select the nested fields within the json:

SELECT EXTRACTJSONFIELD(payload, '$.version') AS version FROM customer;

However, it looks like your payload data does not have a valid JSON format.


Applying a STRUCT schema

If your entire payload is encoded as JSON string, which means your data looks like:

{
  "schema": {
   "type": "string",
   "optional": true
  },
  "payload": {
    "version"="1",
    "customerId"="76813432",
    "phone"="76813432"
  }
} 

you can define STRUCT as below:

CREATE STREAM customer (
  schema STRUCT<
    type VARCHAR,
    optional BOOLEAN>,
  payload STRUCT<
    version VARCHAR,
    customerId VARCHAR,
    phone VARCHAR>
) 
WITH (
    KAFKA_TOPIC='customers',
    VALUE_FORMAT='JSON'
);

and finally referencing individual fields can be done like this:

CREATE STREAM customer_analysis AS
SELECT
  payload->version as VERSION,
  payload->customerId as CUSTOMER_ID,
  payload->phone as PHONE
FROM customer
EMIT CHANGES;
Michael Heil
  • 16,250
  • 3
  • 42
  • 77