Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 38, 382000, tzinfo=tzlocal())}}, {'Region': 'us-east-1', 'IPAddress': '01.000.2.12', 'StatusReport': {'Status': 'Success: DNS resolution Success: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 35, 371000, tzinfo=tzlocal())}}, {'Region': 'us-west-1', 'IPAddress': '01.000.14.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 34, 715000, tzinfo=tzlocal())}}, {'Region': 'us-west-2', 'IPAddress': '01.000.22.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 42, 801000, tzinfo=tzlocal())}}, {'Region': 'us-west-2', 'IPAddress': '01.000.18.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 25, 189000, tzinfo=tzlocal())}}, {'Region': 'us-east-1', 'IPAddress': '01.000.1.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 42, 293000, tzinfo=tzlocal())}}]}

Problem:

I need to find every failure in the string together with its associated message, and the match should not pick up any success entries. The result should look like this:

Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 38, 382000, tzinfo=tzlocal())}}, {'Region': 'us-east-1', 'IPAddress': '01.000.2.12', 'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 34, 715000, tzinfo=tzlocal())}}, {'Region': 'us-west-2', 'IPAddress': '01.000.22.10' etc.

What I tried:

Status':.+Failure.*(?=Success)

and

'Status':.+

but it doesn't give me what I want.

Please help!!

Aaron
RMish
  • Looks like you're trying to parse JSON, which is a language complex enough to warrant its own specialized parser rather than generic regexes. That said, your sample is simple enough that `{'Status':\s*'Failure[^}]*}` seems to match all your failures ([regex101](https://regex101.com/r/fiooss/2)) – Aaron Feb 06 '20 at 15:51
  • But really you shouldn't expect it to work in all cases, and something that works most of the time is often more dangerous than something that plainly doesn't work, as it gives a false sense of security. If you disclose more info about your environment (are you parsing this data from a specific language, shell or tool?) we might be able to guide you toward a proper JSON parser that would extract all failures properly – Aaron Feb 06 '20 at 15:55
  • Thank you very much for your reply. This is coming in as event data from the CloudWatch logs to Splunk. – RMish Feb 06 '20 at 18:21
  • I've added the Splunk tag so that people familiar with the tool might answer the question. From [what I read](https://answers.splunk.com/answers/100575/splunk-rest-api-json-parsing.html) Splunk natively parses JSON, but I might be missing some technicalities – Aaron Feb 06 '20 at 20:30

1 Answer

One main issue is that the data is similar to, but not quite, JSON. Splunk handles JSON pretty well, either at index time or with a command such as spath.

Given that your sample data isn't JSON, we need to fall back to regular expressions.
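
To see why a JSON parser rejects this data, here is a minimal sketch in Python (my choice for illustration only, it is not part of the Splunk setup). The `sample` string is one entry copied from the question, and `is_valid_json` is a hypothetical helper name. The single-quoted strings alone are enough to break strict JSON, and the `datetime.datetime(...)` call is not a JSON value either.

```python
import json

# One entry copied from the question's data (assumption: a single entry is
# enough to show the problem).
sample = (
    "{'Region': 'us-west-1', 'IPAddress': '01.000.14.10', "
    "'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', "
    "'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 34, 715000, tzinfo=tzlocal())}}"
)


def is_valid_json(text):
    """Return True if text parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(sample))                   # the question's data
print(is_valid_json('{"Status": "Failure"}'))  # a strict-JSON equivalent
```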

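This is a pretty basic regular expression that extracts everything from an opening { up to the next double }}, which matches your data. The lazy .*? stops each match at the first }}, so entries are separated even when the whole event sits on one line. max_match=0 tells Splunk to match as many times as possible; (?m) only turns on multiline mode for ^ and $ and is not strictly needed here. The example uses a field called raw; against real events you would normally point field at _raw.

| rex max_match=0 field=raw "(?m)(?<r>{.*?}})"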

Now that Splunk has matched each entry in the event, we can split them into separate events, and drop the full event.

| mvexpand r | fields - raw

Next, run a rex over each entry to extract just the status message into status_msg.

| rex field=r "'Status': '(?<status_msg>[^']+)'"

Finally, drop the events/rows where status_msg contains Success.

| where NOT status_msg LIKE "%Success%"

Here is an example of the regular expressions working over your data.

| makeresults | eval raw="{[{'Region': 'us-east-1', 'IPAddress': '01.000.2.12', 'StatusReport': {'Status': 'Success: DNS resolution Success: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 35, 371000, tzinfo=tzlocal())}},
  {'Region': 'us-west-1', 'IPAddress': '01.000.14.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 34, 715000, tzinfo=tzlocal())}},
  {'Region': 'us-west-2', 'IPAddress': '01.000.22.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 42, 801000, tzinfo=tzlocal())}}, 
  {'Region': 'us-west-2', 'IPAddress': '01.000.18.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 25, 189000, tzinfo=tzlocal())}},
  {'Region': 'us-east-1', 'IPAddress': '01.000.1.10', 'StatusReport': {'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', 'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 42, 293000, tzinfo=tzlocal())}}]}"
| rex max_match=0 field=raw "(?m)(?<r>{.*?}})"
| mvexpand r | fields - raw
| rex field=r "'Status': '(?<status_msg>[^']+)'"
| where NOT status_msg LIKE "%Success%"
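
If you want to check the logic outside Splunk, the same three steps (split into entries, pull out the status, drop the successes) can be sketched in Python. This is only an illustration under assumptions: `raw` holds two entries copied from the question and kept on a single line, and `failed_statuses`, `ENTRY_RE` and `STATUS_RE` are hypothetical names. The entry pattern uses a lazy `.*?` so that each match stops at the first `}}` even without line breaks.

```python
import re

# Two entries copied from the question, deliberately kept on one line.
raw = (
    "[{'Region': 'us-east-1', 'IPAddress': '01.000.2.12', 'StatusReport': "
    "{'Status': 'Success: DNS resolution Success: Rcode Domain(3)', "
    "'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 35, 371000, tzinfo=tzlocal())}}, "
    "{'Region': 'us-west-1', 'IPAddress': '01.000.14.10', 'StatusReport': "
    "{'Status': 'Failure: DNS resolution failed: Rcode Domain(3)', "
    "'CheckedTime': datetime.datetime(2017, 2, 1, 14, 47, 34, 715000, tzinfo=tzlocal())}}]"
)

# Step 1 (rex + mvexpand): one match per entry, from "{" to the next "}}".
ENTRY_RE = re.compile(r"\{.*?\}\}")
# Step 2 (second rex): capture the text between the quotes after 'Status':.
STATUS_RE = re.compile(r"'Status': '(?P<status_msg>[^']+)'")


def failed_statuses(text):
    """Return the status message of every entry that does not mention Success."""
    failures = []
    for entry in ENTRY_RE.findall(text):
        match = STATUS_RE.search(entry)
        # Step 3 (where NOT ... LIKE "%Success%"): keep non-success rows only.
        if match and "Success" not in match.group("status_msg"):
            failures.append(match.group("status_msg"))
    return failures


print(failed_statuses(raw))
```

With the two entries above, only the us-west-1 failure message is kept.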
Simon Duff