1

I have a case in which the client sends us completely dynamic, unexpected data. We want to do some transformations to the data, but index the dynamic data as it is.

Each entry in that JSON can be of a different type, so MAP<VARCHAR, VARCHAR> cannot be used.

The entries' content (field names) also is not known in advanced, so also STRUCT or SELECT EXTRACTJSON is not a viable solution.

For example, an entry may look like this:

{
  "id": "2345",
  "name": "sunrise",
  "fields": {
    "price": 123,
    "currency": "usd",
    "transaction_time": "2021-01-03T11:20:00.000Z",
    "abc": "some string"
  }
}

Later on, we might get an update to that entry, and it will look like this:

{
  "id": "2345",
  "name": "sunrise",
  "fields": {
    "price": 433,
    "currency": "eur",
    "transaction_time": "2021-01-04T12:34:00.000Z",
    "abc": 244,
    "extra_field": "pending"
  }
}

Notice that abc's type can be changed, and that we can't really SELECT the fields as we don't know what they will be. This is data that we get from a MongoDB source connector - it is NoSQL with no schema.

The expected output is with a transformation and fields are written exactly as sent:

{
  "id": "2345",
  "nameTransformed": "Sunrise-Miami",
  "fields": {
    "price": 433,
    "currency": "eur",
    "transaction_time": "2021-01-04T12:34:00.000Z",
    "abc": 244,
    "extra_field": "pending"
  }
}
SharonReytan
  • 73
  • 1
  • 7
  • You can index dynamic arbitrary data using mongodb wildcard indexes. But how do you intend to *transform* arbitrary data? – Buzz Moschetti May 18 '22 at 12:59
  • my goal - is to save that dynamic json as it is. in mongo it's indeed no issue, but with ksqldb in the middle i'm in trouble here. i want to transform other parts in the json, and write `fields` as a json – SharonReytan May 18 '22 at 14:44
  • OK -- so it is safe to say this isn't actually a mongodb issue; it is ksqldb issue? – Buzz Moschetti May 18 '22 at 14:48
  • right, how can i write those fields, exactly as they are. will remove the mongodb tag – SharonReytan May 18 '22 at 14:57
  • Can you please specify what would you like to do with these records? What kind of transformations would you like to do? Or the question is about creating a table/stream for the dynamic data? Are there any fields that always present in correct records, like `id` and `fields`? Is there a primary key field? – Alex Sorokoumov May 24 '22 at 09:42
  • id and name will always be in the doc and will be transformed. the other property of fields entry is a dynamic json that i’d like to forward exactly as it is. – SharonReytan May 25 '22 at 14:35

0 Answers0