0

I am ingesting a log that ends up putting a lot of good data in the "AdditionalExtensions" field. This is an example of the data:

Action=Allow;User=test.test@test.com;SrcIpAddr=192.168.1.146;SrcPortNumber=64694

I am trying to figure out if there is a way to, once I split(AdditionalExtensions,";") to convert the string array that the split() function produces into a property bag, and then loop through its keys, sort of what in python would be:

for k,v in mydict:
  print(f'{k}= {v}')

but of course I would have to extend its key into for example, Action, User, SrcIpAddr, SrcPortNumber so I can ultimately end up with something like:

| extend Action = loopThrough1stIteminDict[v],
         User = loopThrough2ndIteminDict[v]
| project Action, User

And the end result be:

Action   User

Allow    test.test@test.com

if that's not possible, what's more efficient in KQL, to use this:

| extend parser = split(AdditionalExtensions,";")
| extend Action = extract("=(.*)",1,tostring(parser[0])),

or instead of using extract(), use substring(), but using indexof() to tell the substring I want to start at the index where the equal sign is, and go to the end of the string

There is about 30-40 fields in the AdditionalExtensions field that I am looking to do this thorough for an important log that I may need to go back in time a lot, I don't want a query that goes back 2 hours to stall, let alone have to go back 7 days or more, KQL fails a lot when going back in time, nothing like Splunk of course, but I am working on this product now.

Rakim
  • 167
  • 11

1 Answers1

4

Ideally, you'll adjust the source to write semi-structured data in a supported and standard format - JSON - instead of semicolon/equals-sign-separated key-value pairs. This would allow you to avoid inefficient query-time parsing of the raw data whenever you query it.

Ignoring performance and efficiency, you could still achieve that kind of parsing as follows:

datatable (s:string)
[
    'Action=Allow;User=test.test@test.com;SrcIpAddr=192.168.1.146;SrcPortNumber=64694',
    'Action=Deny;User=test.test@test2.com;SrcIpAddr=192.168.1.147;SrcPortNumber=64695',
    'Action=Allow;User=test.test@test3.com;SrcIpAddr=192.168.1.148;SrcPortNumber=64696'
]
| mv-apply s = split(s, ";") on (
    parse s with key "=" value
    | summarize b = make_bag(pack(key, value))
)
| evaluate bag_unpack(b)
Action SrcIpAddr SrcPortNumber User
Allow 192.168.1.146 64694 test.test@test.com
Deny 192.168.1.147 64695 test.test@test2.com
Allow 192.168.1.148 64696 test.test@test3.com
Yoni L.
  • 22,627
  • 2
  • 29
  • 48
  • this is a super awesome, succinct solution, thanks! I noticed you do a lot of work for the Microsoft folks, keep it up! – Rakim Sep 28 '21 at 19:15