The INPUT_STREAM stream in Kafka was created with the ksql statement below:
CREATE STREAM INPUT_STREAM (year STRUCT<month STRUCT<day STRUCT<hour INTEGER, minute INTEGER>>>) WITH (KAFKA_TOPIC = 'INPUT_TOPIC', VALUE_FORMAT = 'JSON');
It defines a nested JSON schema four levels deep, with the fields year, month, day, hour and minute, like the one below:
{
  "year": {
    "month": {
      "day": {
        "hour": string,
        "minute": string
      }
    }
  }
}
I want to create a second stream, OUTPUT_STREAM, that reads the messages from INPUT_STREAM and re-maps the field names to custom ones. I want to take the hour and minute values and place them in nested JSON under the fields one and two, like this:
{
  "one": {
    "two": {
      "hour": string,
      "minute": string
    }
  }
}
I went ahead and put together a ksql statement to create OUTPUT_STREAM:
CREATE STREAM OUTPUT_STREAM WITH (KAFKA_TOPIC='OUTPUT_TOPIC', REPLICAS=3) AS SELECT YEAR->MONTH->DAY->HOUR ONE->TWO->HOUR FROM INPUT_STREAM EMIT CHANGES;
The statement fails with an error. Is there a syntax error in this statement? Is it possible to specify a nested destination field name the way I do in this fragment?
...AS SELECT YEAR->MONTH->DAY->HOUR ONE->TWO->HOUR FROM...
I've also tried using STRUCT instead of ONE->TWO->HOUR:
CREATE STREAM OUTPUT_STREAM WITH (KAFKA_TOPIC='OUTPUT_TOPIC', REPLICAS=3) AS SELECT YEAR->MONTH->DAY->HOUR ONE STRUCT<TWO STRUCT<HOUR VARCHAR>> FROM INPUT_STREAM EMIT CHANGES;
This statement fails with an error too.
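For reference, one direction that may be worth checking: the ksqlDB documentation describes a struct constructor of the form STRUCT(name := expression, ...), which builds a nested value inside the SELECT list instead of naming a nested destination after the expression. The statement below is only a sketch of that idea. It has not been run against this stream, and it assumes the installed ksqlDB version supports the constructor. The backticks are there on the assumption that unquoted identifiers are upper-cased.

```sql
-- Sketch only: assumes a ksqlDB version that supports STRUCT(name := expression).
-- Backticks are used to keep the lower-case names one, two, hour and minute.
CREATE STREAM OUTPUT_STREAM WITH (KAFKA_TOPIC='OUTPUT_TOPIC', REPLICAS=3) AS
  SELECT STRUCT(
           `two` := STRUCT(
             `hour`   := YEAR->MONTH->DAY->HOUR,
             `minute` := YEAR->MONTH->DAY->MINUTE
           )
         ) AS `one`
  FROM INPUT_STREAM
  EMIT CHANGES;
```

Since INPUT_STREAM declares hour and minute as INTEGER, the values in this sketch would come out as integers. Getting strings, as in the target JSON, would presumably need something like CAST(YEAR->MONTH->DAY->HOUR AS VARCHAR).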