
I have a problem with a regex.

I have a file with tons of lines containing a lot of special characters:

this is an Example with all sensitive case 0123456789/*-+.&é"'(-è_çà)=~#{[|`\^@]}²$*ù^%µ£¨¤,;:!?./§<>AZERTYUIOPMLKJHGFDSQWXCVBNazertyuiopmlkjhgfdsqwxcvbn

I tried many regexes but never got the expected result. I need to select everything from Example to the end of the line.


I tried this command on https://www.regextester.com/ :

\sExample(.*?)+


And when I tried it in C#, the only result I got was: Example

I don't understand why.

PierreDuv

1 Answer


Here's a quick explanation of greedy and pessimistic (more commonly called lazy) matching:

Here is test data:

Example word followed by another word and then more

Here are two regexes:

Example.*word
Example.*?word

The first is greedy. Regex will match Example, then it will take .* which consumes everything all the way to the END of the string, and then works backwards, spitting one character at a time back out, trying to make the match succeed. It will succeed when Example word followed by another word is matched, the .* having matched word followed by another (and the spaces at either end).

The second is pessimistic; it nibbles forwards along the string one character at a time, trying to match. Regex will match Example, then take one more character into the .*? wildcard, then check whether word comes next, which it does. So the pessimistic .*? matches only a single space, and the full match in pessimistic mode is Example word
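To see the difference concretely, here is a minimal sketch that runs both patterns over the test data above. The class name GreedyVsLazy and the helper FirstMatch are made up for this illustration; only Regex.Match and Match.Value come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: returns the full text of the first match, or null if there is none.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string data = "Example word followed by another word and then more";

        // Greedy: .* runs to the end of the string, then backtracks to the LAST "word".
        Console.WriteLine(FirstMatch("Example.*word", data));
        // prints: Example word followed by another word

        // Pessimistic (lazy): .*? grows one character at a time and stops at the FIRST "word".
        Console.WriteLine(FirstMatch("Example.*?word", data));
        // prints: Example word
    }
}
```

Both patterns find a match; they only disagree about where it ends.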

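Because you say you want the whole string after Example, I recommend a greedy quantifier: it immediately takes the whole remaining string and declares a match, rather than nibbling forwards one character at a time (slow). A pessimistic quantifier at the very end of a pattern also has nothing after it to force it forwards, so it is free to match nothing at all, which is consistent with C# returning only Example.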

This, then, will match (and capture) everything after Example:

\sExample(.*)
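As a rough check of why the trailing quantifier matters, the sketch below compares a pessimistic and a greedy capture group at the end of the pattern. The sample line is a shortened, made-up stand-in for the line in the question, and the names TrailingQuantifier and Capture are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingQuantifier
{
    // Hypothetical helper: returns what group 1 captured, or null if the pattern does not match.
    public static string Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Assumed sample data, not the exact line from the question.
        string line = "this is an Example with 0123/*-+.&(){}[]|^$? and more";

        // Pessimistic with nothing after it: matches as few characters as possible, i.e. none.
        Console.WriteLine("[" + Capture(@"\sExample(.*?)", line) + "]");
        // prints: []

        // Greedy: takes everything up to the end of the line.
        Console.WriteLine("[" + Capture(@"\sExample(.*)", line) + "]");
        // prints: [ with 0123/*-+.&(){}[]|^$? and more]
    }
}
```

The special characters in the input need no escaping; only characters in the pattern itself are interpreted by the regex engine.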

The brackets make a capture group. In C# we can name the group by putting ?<namehere> at the start of the brackets, and then everything that .* matches can be retrieved with:

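Regex r = new Regex(@"\sExample(?<x>.*)"); // verbatim string (@) so \s reaches the regex engine intact
Match m = r.Match(" Exampleblahblah");     // leading space, because the pattern starts with \s
Console.WriteLine(m.Groups["x"].Value); //prints: blahblah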

Note that if your data contains newlines, . doesn't match a newline unless you enable RegexOptions.Singleline when you create the regex.
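The effect of that option can be sketched as follows, assuming a two-line input invented for the demonstration. SinglelineDemo and Capture are hypothetical names; RegexOptions.None and RegexOptions.Singleline are the real .NET enum members.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Hypothetical helper: returns what the named group x captured, or null if there is no match.
    public static string Capture(string input, RegexOptions options)
    {
        Match m = Regex.Match(input, @"\sExample(?<x>.*)", options);
        return m.Success ? m.Groups["x"].Value : null;
    }

    public static void Main()
    {
        // Assumed sample data containing one newline.
        string text = "an Example first\nsecond";

        // Default: . stops at the newline, so only " first" is captured.
        Console.WriteLine(Capture(text, RegexOptions.None).Length);
        // prints: 6

        // Singleline: . also matches \n, so the capture runs to the end of the input.
        Console.WriteLine(Capture(text, RegexOptions.Singleline).Length);
        // prints: 13
    }
}
```

If the file is read line by line, each input holds a single line and the default behaviour is already enough.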


Caius Jard