I've seen several similar questions, and even posted one myself, but this one is rather specific.
Say a single regex contains two alternative sub-patterns that can both match the same text. My luck always seems to lean towards the regex matching with the wrong one. (I am using the .NET Regex class in C#.)
I have two types of strings that I need to break down:
01 - First Value|02 - Second Value|Blank - Ignore
And:
A - First ValueblankB - Second ValueC - Third Value
So my desired result is to match each Code to its Meaning with a single pattern string:
Code,Meaning
01,First Value
02,Second Value
Blank,Ignore
A,First Value
blank,
B,Second Value
C,Third Value
I have tried several patterns but can never seem to get it quite right. The closest I have been able to get is:
(([A-Z0-9]{1,4})[ \-–]{1,3}|([Bb]lank)[ \-–]{0,3})(([A-Z][a-z]+[.,;| ]?)+)
My breakdown:
[A-Z0-9]{1,4}[ \-–]{1,3}
--> this matches the code: uppercase letters or digits, 1 to 4 characters long, followed by 1 to 3 characters of space, hyphen, or the en dash (–) that comes from the HTML.
or
[Bb]lank[ \-–]{0,3}
--> "Blank" or "blank" followed by 0 to 3 characters of space, hyphen, or the en dash (–) from the HTML
then
(([A-Z][a-z]+[.,;| ]?)+)
--> should match one or more capitalized words, each optionally followed by a separator or a space, so "First Value" and "Second Value" should each be matched as a whole.
The main problem is that the final group matches "Valueblank" in the second input string, because [a-z]+ keeps consuming the lowercase letters of "blank". I want "[Bb]lank" to always be matched as part of the first group (the code) and NEVER as part of the final group (the meaning).
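In case it helps, here is a minimal sketch that reproduces the behaviour with the pattern exactly as written above. The class and method names (CodeMeaningRepro, Extract) are placeholders made up for this post, not my real code, and I am assuming the code ends up in group 2 or group 3 and the meaning in group 4.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for this post only: applies the pattern and lists what it captures.
public static class CodeMeaningRepro
{
    // The pattern from above, unchanged.
    public const string Pattern =
        @"(([A-Z0-9]{1,4})[ \-–]{1,3}|([Bb]lank)[ \-–]{0,3})(([A-Z][a-z]+[.,;| ]?)+)";

    // Returns one "Code,Meaning" row per match.
    // Group 2 holds a normal code, group 3 holds "Blank"/"blank", group 4 holds the meaning.
    public static List<string> Extract(string input)
    {
        var rows = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            string code = m.Groups[2].Success ? m.Groups[2].Value : m.Groups[3].Value;
            rows.Add(code + "," + m.Groups[4].Value);
        }
        return rows;
    }

    public static void Main()
    {
        foreach (string row in Extract("A - First ValueblankB - Second ValueC - Third Value"))
        {
            Console.WriteLine(row);
        }
        // Prints:
        // A,First Valueblank
        // B,Second Value
        // C,Third Value
    }
}
```

Note that "blank" never comes out as a code of its own here; it is swallowed into the meaning of "A".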
I tried putting a (?![Bb]lank) negative lookahead in the final group, but it never seems to work. Any help would be appreciated.
Thanks
Jaeden "Sifo Dyas" al'Raec Ruiner