One of these days I'll learn regex.

I have the following filename

PE-run1000hbgmm3f1-job1000hbgmm3dt-Output-Workflow-1000hbgmm3fb-22.07.17.log

I can get the first three fields to match with this pattern:

(?<logtype>[^-]+)-(?<run_id>[^-]+)-(?<job_id>[^-]+)-(?<capability>[^(0-9\.0-9\.0-9)]+)

logtype: PE
run_id: run1000hbgmm3f1
job_id: job1000hbgmm3dt

But I'm getting

capability: Output-Workflow-

...though I want it to be

capability: Output-Workflow-1000hbgmm3fb

...that is, all the text after the job_id up to the timestamp HH.mm.ss. Any help please? Thanks!

Chris F

1 Answer

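This happens because a negated character class cannot negate a sequence of characters. [^(0-9\.0-9\.0-9)] matches any single character other than (, a digit, . or ), so the capture stops at the first digit it meets, which is why you get Output-Workflow- and nothing more.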
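To see that behaviour in isolation, here is a minimal sketch in Python. The question does not say which regex engine is in use, so Python is an assumption here, and Python spells named groups (?P<name>...) instead of (?<name>...); the variable names are illustrative only.

```python
import re

filename = "PE-run1000hbgmm3f1-job1000hbgmm3dt-Output-Workflow-1000hbgmm3fb-22.07.17.log"

# The pattern from the question, rewritten with Python's (?P<name>...) syntax.
original = re.compile(
    r"(?P<logtype>[^-]+)-(?P<run_id>[^-]+)-(?P<job_id>[^-]+)-"
    r"(?P<capability>[^(0-9\.0-9\.0-9)]+)"
)

match = original.match(filename)
capability = match.group("capability")

# The class excludes the characters "(", ")", "." and any digit, so the
# capture ends right before the "1" of "1000hbgmm3fb".
print(capability)  # Output-Workflow-
```

The parentheses and the repeated 0-9 inside the brackets do not group anything; the class is equivalent to [^()0-9.].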

You may replace your (?<capability>[^(0-9\.0-9\.0-9)]+) with (?<capability>.*?)-\d{2}\.\d{2}\.\d{2} to get the right value.

Now, (?<capability>.*?)-\d{2}\.\d{2}\.\d{2} will match any zero or more characters (and capture them into the "capability" group), as few as possible since *? is a lazy quantifier, up to the first occurrence of a - followed by 2 digits and then two sequences of a dot (\.) followed by 2 digits.

See the regex demo at regex101.com.
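As a quick check of the corrected pattern outside regex101, here is a small Python sketch. Python is an assumption (the question does not name the engine), and it needs (?P<name>...) for named groups; the names fixed and fields are illustrative only.

```python
import re

filename = "PE-run1000hbgmm3f1-job1000hbgmm3dt-Output-Workflow-1000hbgmm3fb-22.07.17.log"

# Same first three groups as in the question; the capability group is now
# lazy and is followed by the literal "-NN.NN.NN" tail it must stop before.
fixed = re.compile(
    r"(?P<logtype>[^-]+)-(?P<run_id>[^-]+)-(?P<job_id>[^-]+)-"
    r"(?P<capability>.*?)-\d{2}\.\d{2}\.\d{2}"
)

fields = fixed.match(filename).groupdict()
for name, value in fields.items():
    print(f"{name}: {value}")
# logtype: PE
# run_id: run1000hbgmm3f1
# job_id: job1000hbgmm3dt
# capability: Output-Workflow-1000hbgmm3fb
```

The lazy group keeps growing one character at a time until the rest of the pattern can match, so the hyphens inside Output-Workflow-1000hbgmm3fb are kept and only the final -22.07.17 is left out.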

Wiktor Stribiżew