
I currently have this regex:

(?P<key>\w+)=(?P<value>[a-zA-Z0-9-_:/@. ]+)

Input row 1: event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";

Input row 2: dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";

It correctly extracts the "event=1921;" but does not extract the two other types.

  1. How do I extract the "json={...}" using Key (JSON) and Value?
  2. How do I extract "name":"gone" using Key (Name) and Value (gone)? The solution needs to be dynamic since key fields will be named differently in other rows.
Just J

1 Answer


you should be able to use the parse operator: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parseoperator

for example:

print input = 'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";'
| parse input with * "json=" json:dynamic ',"id"' * '"name":"' name '"' *

if your payload / property names are entirely dynamic, then:

a. I would recommend you evaluate your options to structure the source data in standard format (currently, even the "json" part of it isn't valid JSON)

b. you could try the following - functional, but very inefficient (not recommended for large-scale data processing)

datatable(input:string)
[
    'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";',
    'dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";'
]
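// split each row into the key=value prefix, the pseudo-JSON body, and the trailing "key":value suffix
| parse input with prefix ";json={" json:dynamic '},' suffix
// collect the key=value pairs of the prefix into a property bag
| mv-apply x = extract_all(@'(\w+)=(\w+)', prefix) on (
    project p = pack(tostring(x[0]), x[1])
    | summarize b1 = make_bag(p)
)
// do the same for the "key":value pairs of the suffix
| mv-apply y = extract_all(@'"(\w+)":"?(\w+)"?', suffix) on (
    project p = pack(tostring(y[0]), y[1])
    | summarize b2 = make_bag(p)
)
// restore the braces around the json text and merge both bags
| project json = strcat("{", json, "}"), b = bag_merge(b1, b2)
// expand the merged bag into one column per key
| evaluate bag_unpack(b)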
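
to see what the suffix regex captures before anything is packed into a bag, a minimal sketch like the one below may help. it assumes the suffix string is what the parse step is expected to produce for input row 1; the column names pairs, pair, key and value are only illustrative.

```kusto
// Assumed suffix for input row 1 (the part after the first '},')
print suffix = '"id":2,"address":null,"name":"gone";'
// extract_all with two capture groups returns an array of [key, value] arrays
| extend pairs = extract_all(@'"(\w+)":"?(\w+)"?', suffix)
// one row per captured pair
| mv-expand pair = pairs
| project key = tostring(pair[0]), value = tostring(pair[1])
```

every value comes back as text, so the unquoted 2 and null are captured as the strings "2" and "null". since \w+ only matches word characters, a value that contains spaces, colons or backslashes would likely be cut at the first such character, so the value pattern may need widening for other rows.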
Yoni L.
  • Interesting! So what if there is no 'logical' name for the Key field since it needs to be dynamic? Ex. "Id" and "Name" could have been "x1" – Just J Jan 25 '21 at 22:30
  • that's some bad format you're coerced to work with... it will lead to inefficiencies at query time. i've updated my reply with a functional, but not too-efficient solution. you should consider reformatting your source data to a standard format – Yoni L. Jan 25 '21 at 22:52