I am using the Flink Table API. I have a table definition, and I want to select all fields and convert them to a JSON string in a new field.

My table has three fields: a: String, b: Int, c: Timestamp.

If I do

INSERT INTO kinesis
SELECT a, b, c from my_table

The Kinesis stream then receives JSON records:

{
  "a" : value,
  "b": value,
  "c": value
}

However, I want something similar to Spark's to_json and struct functions:

INSERT INTO kinesis
SELECT "constant_value" AS my_source, to_json(struct(*)) AS payload FROM my_table

So, the expected result is:

{
  "my_source": "constant_value",
  "payload": "json string from the first example that has a,b,c"
}

I can't see any to_json or struct() functions in Flink. Is it possible to implement?

— lalala

2 Answers

You might have to implement your own user-defined function (UDF).

This is what I did. Here I assume the call to the UDF looks like:

to_json('col1', col1, 'col2', col2)

import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.InputGroup;
import org.apache.flink.table.functions.ScalarFunction;

public class RowToJson extends ScalarFunction {
    public String eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object... row) throws Exception {
        if (row.length % 2 != 0) {
            throw new Exception("Wrong key/value pairs!");
        }

        return IntStream.range(0, row.length).filter(index -> index % 2 == 0).mapToObj(index -> {
            String name = row[index].toString();
            Object value = row[index + 1];
            // Serialize the pair; quote everything except numbers and booleans
            String rendered = (value instanceof Number || value instanceof Boolean)
                    ? value.toString()
                    : "\"" + value + "\"";
            return "\"" + name + "\":" + rendered;
        }).collect(Collectors.joining(",", "{", "}"));
    }
}
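
The value rendering inside the mapper is the fiddly part. Here is a standalone, Flink-free sketch of that pair-serialization logic so it can be tested in isolation; the `JsonPairs` class name, its quoting rules, and its escaping are my assumptions, not Flink API:

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class JsonPairs {
    // Build a JSON object string from alternating name/value arguments,
    // e.g. toJson("a", "x", "b", 1) -> {"a":"x","b":1}
    public static String toJson(Object... row) {
        if (row.length % 2 != 0) {
            throw new IllegalArgumentException("Wrong key/value pairs!");
        }
        return IntStream.range(0, row.length)
                .filter(index -> index % 2 == 0)
                .mapToObj(index -> {
                    String name = row[index].toString();
                    Object value = row[index + 1];
                    String rendered;
                    if (value == null) {
                        // JSON null, unquoted
                        rendered = "null";
                    } else if (value instanceof Number || value instanceof Boolean) {
                        // Numbers and booleans are emitted as-is
                        rendered = value.toString();
                    } else {
                        // Everything else is treated as a string: escape backslashes and quotes
                        rendered = "\"" + value.toString()
                                .replace("\\", "\\\\")
                                .replace("\"", "\\\"") + "\"";
                    }
                    return "\"" + name + "\":" + rendered;
                })
                .collect(Collectors.joining(",", "{", "}"));
    }
}
```

Note this hand-rolled escaping is deliberately minimal (it ignores control characters, for instance); for production use, delegating to a JSON library such as Jackson inside the UDF would be safer.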

If you expect the UDF to be used with GROUP BY, you have to extend AggregateFunction instead:

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.InputGroup;
import org.apache.flink.table.functions.AggregateFunction;

public class RowsToJson extends AggregateFunction<String, List<String>> {
    @Override
    public String getValue(List<String> accumulator) {
        // Emit all accumulated objects as a JSON array
        return accumulator.stream().collect(Collectors.joining(",", "[", "]"));
    }

    @Override
    public List<String> createAccumulator() {
        return new ArrayList<String>();
    }

    public void accumulate(List<String> acc, @DataTypeHint(inputGroup = InputGroup.ANY) Object... row) throws Exception {
        if (row.length % 2 != 0) {
            throw new Exception("Wrong key/value pairs!");
        }
        String json = IntStream.range(0, row.length).filter(index -> index % 2 == 0).mapToObj(index -> {
            String name = row[index].toString();
            Object value = row[index + 1];
            // Serialize the pair; quote everything except numbers and booleans
            String rendered = (value instanceof Number || value instanceof Boolean)
                    ? value.toString()
                    : "\"" + value + "\"";
            return "\"" + name + "\":" + rendered;
        }).collect(Collectors.joining(",", "{", "}"));
        acc.add(json);
    }
}
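
Once registered (for example via `tableEnv.createTemporarySystemFunction`), the aggregate version could be used in a grouped query along these lines; the registered name `rows_to_json` and the column names are illustrative, not from the question:

```sql
-- Java side (illustrative):
-- tableEnv.createTemporarySystemFunction("rows_to_json", RowsToJson.class);
SELECT a, rows_to_json('b', b, 'c', c) AS payload
FROM my_table
GROUP BY a
```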
— Grant

Since Flink 1.15, the JSON_OBJECT SQL function can help you create a JSON field from individual columns: flink json functions

SELECT JSON_OBJECT('col1' value col1, 'col2' value col2) FROM table
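
Applied to the schema in the question, this would let you build the whole target record in SQL with no UDF at all; JSON_OBJECT returns a JSON string, so it can be selected as a plain column (sketch, assuming the `kinesis` sink from the question accepts these two fields):

```sql
INSERT INTO kinesis
SELECT
  'constant_value' AS my_source,
  JSON_OBJECT('a' VALUE a, 'b' VALUE b, 'c' VALUE c) AS payload
FROM my_table
```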