0

There's an external list of malicious domains/URL's, and I want to periodically search the logs, but there's an obvious problem:

let abuse_domain = (externaldata(sentinel_domain: string )
[@"https://managedsentinel.com/downloads/covid19_domains.txt"]
with (format="txt"))
| where sentinel_domain !startswith "#"
| project sentinel_domain;
abuse_domain
| join 
(
DeviceNetworkEvents
| where Timestamp > ago(1h) 
) on $left.sentinel_domain == $right.RemoteUrl
| project Timestamp,DeviceName,RemoteUrl,DeviceId,ReportId

The On clause isn't going to work because the two items will never completely match. How can I get a match when $left.sentinel_domain is a substring of $rightRemoteUrl ?

Mullets4All
  • 39
  • 1
  • 8

1 Answers1

0

Try using parse_url to extract the domain (Host) from RemoteUrl first.

Like so:

let abuse_domain = (externaldata(sentinel_domain: string )
[@"https://managedsentinel.com/downloads/covid19_domains.txt"]
with (format="txt"))
| where sentinel_domain !startswith "#"
| project sentinel_domain;
abuse_domain
| join 
(
DeviceNetworkEvents
| where Timestamp > ago(1h)
| extend Host = tostring(parse_url(RemoteUrl).Host)
) on $left.sentinel_domain == $right.Host
| project Timestamp,DeviceName,RemoteUrl,DeviceId,ReportId
Jonathan Myers
  • 789
  • 4
  • 12
  • Thanks for pointing me in the right direction - what's the best way to do that? It seems clear that I need to extract the url before the join, but if I insert this line: let parsedurl = tostring(parseurl(abuse_domain).Host) – Mullets4All Jan 11 '21 at 15:48
  • Sorry I took too long to edit the above, it should have said: Thanks for pointing me in the right direction - what's the best way to do that? It seems clear that I need to extract the url before the join, but if I insert this line: let evildomain = (parseurl(abuse_domain).Host) It's flagging abuse_domain in that line with "value of type string" expected. But isn't it a string? If I try to wrap abuse_domain in tostring, it's "Scalar value expected". – Mullets4All Jan 11 '21 at 15:56
  • You have to cast values extracted from a `dynamic` value before they can be used (I think because Kusto can't know at query parse time what the type of the extracted value will be). Try `let evildomain = tostring(parse_url(abuse_domain).Host)` – Jonathan Myers Jan 12 '21 at 17:07
  • When I do that it errors and for abuse_domain it says "A value of string expected". It's like a weird chicken and egg problem. – Mullets4All Jan 13 '21 at 18:16
  • Aha, I see what went wrong now. You're trying to parse `abuse_domain`, but that's a whole table. What you want is to `extend` `DeviceNetworkEvents` _inside_ the join with the host extracted from its `RemoteUrl` column. I've edited my answer with how that should look. Sorry for the slight misdirect. – Jonathan Myers Jan 14 '21 at 20:58
  • You might also want to add `kind=inner` right after the `join` keyword -- the default might not be what you expect. See https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/joinoperator?pivots=azuredataexplorer#join-flavors – Jonathan Myers Jan 14 '21 at 21:01
  • 1
    And you might also need to make it `tolower(tostring(parse_url(RemoteUrl).Host))` since joins are case-sensitive – Jonathan Myers Jan 14 '21 at 21:05