2

I have this kind of data in a Kafka Topic:

{..., fields: { "a": "aval", "b": "bval" } }

If I create a Kafka Engine table, I get an error when using a field definition like this:

fields String

because it (correctly) doesn't recognize it as a String:

2018.07.09 17:09:54.362061 [ 27 ] <Error> void DB::StorageKafka::streamThread(): Code: 26, e.displayText() = DB::Exception: Cannot parse JSON string: expected opening quote: (while read the value of key fields): (at row 1)

As ClickHouse does not currently have a Map or JSONObject type, what would be the best way to work over it, provided I don't know in advance the name of the inner fields ("a" or "b" in the example - so I cannot see Nested structures helping)?

xmar
  • 1,729
  • 20
  • 48
  • Have you tried removing all the intermediate spaces of the original JSON string? iirc, Clickhouse JSON parser doesn't allow that. – Amos Jul 10 '18 at 03:29
  • Sorry, just added spaces for example clarity. Data in Kafka message is actually compact-printed. – xmar Jul 11 '18 at 08:47
  • see three possible ways to resolve it in answer to the question [ClickHouse JSON parse exception: Cannot parse input: expected ',' before](https://stackoverflow.com/a/65105292/303298) – vladimir Dec 02 '20 at 17:49

1 Answers1

0

Apparently, at the moment ClickHouse does not support complex JSON parsing.

From this answer in ClickHouse Github:

Clickhouse uses quick and dirty JSON parser, which does not how to read complex deep structures. So it can't skip that field as it does not know where that nested structure ends. Sorry. :/

So you should preprocess your json with some external tools, of you can contribute to Clickhouse and improve JSON parser.

xmar
  • 1,729
  • 20
  • 48