
I need help writing a regex for the following log line:

URLReputation: Risk unknown, URL: http://facebook.com

I wrote a regex like below:

URLReputation\:\s*(.*?),\s*URL\:\s*(.*)

This works when the URL is present. But when the URL part is missing, the pattern does not match at all, so the URLReputation value is not captured either.
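The behaviour can be reproduced with a short sketch. It assumes Python's `re` module and assumes that a line without a URL simply ends after the reputation text; that second sample line is hypothetical, since the log shown above always includes the URL.

```python
import re

# Pattern from the question, unchanged.
ORIGINAL = re.compile(r"URLReputation\:\s*(.*?),\s*URL\:\s*(.*)")

# Line taken from the log sample above.
full = ORIGINAL.search("URLReputation: Risk unknown, URL: http://facebook.com")

# Hypothetical line with the URL part missing (an assumption about the log format).
missing = ORIGINAL.search("URLReputation: Risk unknown")

full_groups = full.groups()        # both values are captured
missing_is_none = missing is None  # the whole match fails, so nothing is captured
```

The comma and `URL:` are mandatory in the pattern, so the second search fails as a whole rather than returning only the reputation.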

Please help.

Regards,

Mitesh Agrawal


2 Answers


You could replace the non-greedy .*? with a negated character class [^,]+, which matches one or more characters other than a comma. Then make the URL part optional by wrapping it in an optional non-capturing group (?:...)?

You capture the URL value with .* but that can also match an empty string.

You could make the pattern more specific by requiring at least one non-whitespace character with \S+, or by specifying the start of the URL, for example https?://\S+

URLReputation:\s*([^,]+)(?:,\s*URL:\s*(\S+))?
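As a minimal sketch of how this pattern behaves, assuming Python's `re` module (the question does not name a regex engine) and a hypothetical helper called `parse`:

```python
import re

# Pattern from this answer: the reputation is everything up to the first comma,
# and the ", URL: ..." part sits in an optional non-capturing group.
PATTERN = re.compile(r"URLReputation:\s*([^,]+)(?:,\s*URL:\s*(\S+))?")

def parse(line):
    """Return (reputation, url); url is None when the URL part is absent."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group(1), m.group(2)

with_url = parse("URLReputation: Risk unknown, URL: http://facebook.com")
# Hypothetical line without the URL part (an assumption about the log format).
without_url = parse("URLReputation: Risk unknown")
```

Note that [^,]+ also matches newlines and trailing spaces, so if the reputation can be followed by other text on the line, the first group may need trimming or a tighter character class.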


The fourth bird

Assuming the string ends immediately before the comma when the "URL isn't there", you can simply put the comma and what follows in an optional non-capture group and add an end-of-line anchor:

/URLReputation: +(.*?)(?:, +URL:\ +(.*))?$/


Mainly to improve readability, I changed each \s to a space as it appears that spaces are the only whitespace characters you wish to match.
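To illustrate why the end-of-line anchor matters here, the following sketch assumes Python's `re` module and drops the redundant backslash before the space. The `NO_ANCHOR` variant is hypothetical and included only for comparison.

```python
import re

# Pattern from this answer (without the / delimiters).
ANCHORED = re.compile(r"URLReputation: +(.*?)(?:, +URL: +(.*))?$")

# Same pattern without the trailing $, for comparison only.
NO_ANCHOR = re.compile(r"URLReputation: +(.*?)(?:, +URL: +(.*))?")

line = "URLReputation: Risk unknown, URL: http://facebook.com"

anchored_full = ANCHORED.search(line).groups()
# Hypothetical line that ends right after the reputation text.
anchored_short = ANCHORED.search("URLReputation: Risk unknown").groups()

# Without $, the lazy .*? is satisfied by an empty string and the optional
# group is skipped, so neither value is captured usefully.
no_anchor_full = NO_ANCHOR.search(line).groups()
```

With the anchor, the lazy group has to keep expanding until either the optional URL group matches or the end of the line is reached.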

Cary Swoveland