
I am attempting to use Qt to execute a regex in my C++ application. I have done similar regular expressions with Qt in C++ before, but this one is proving difficult.

Given a string with an optional _# suffix (an underscore followed by one or more digits) at the end, I want to extract the part of the string before that suffix.

Examples:

"blue_dog" should result in "blue_dog"
"blue_dog_1" should result in "blue_dog"
"blue_dog_23" should result in "blue_dog"

This is the code I have so far, but it does not work yet:

QString name = "blue_dog_23";
QRegExp rx("(.*?)(_\\d+)?");    
rx.indexIn(name);
QString result = rx.cap(1);  

I have also tried the following additional options in many variations, without luck. My code above always returns an empty string (""):

rx.setMinimal(TRUE);   
rx.setPatternSyntax(QRegExp::RegExp2);
panofish

2 Answers

1

Sometimes it's easier not to pack everything into a single regexp. In your case, you can restrict the manipulation to names that actually end in a _# suffix. Otherwise, the result is simply name:

QString name = "blue_dog_23";
QRegExp rx("^(.*)(_\\d+)$");
QString result = name;
if (rx.indexIn(name) == 0)
    result = rx.cap(1);
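
For readers without Qt at hand, the same idea can be sketched with the standard library's std::regex. This is an analogue under assumptions, not the Qt code above: the helper name stripNumericSuffix is hypothetical, and std::regex_match stands in for the indexIn(...) == 0 check because it only succeeds when the whole string matches.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns `name` without a trailing "_<digits>" suffix.
std::string stripNumericSuffix(const std::string& name)
{
    // Greedy (.*) captures the base name; (_\d+)$ requires the numeric
    // suffix to sit at the very end of the string.
    static const std::regex rx(R"(^(.*)(_\d+)$)");
    std::smatch m;
    // regex_match succeeds only if the entire string matches the pattern.
    if (std::regex_match(name, m, rx))
        return m[1].str();
    return name;  // no suffix present: keep the name unchanged
}
```

Because the suffix group is mandatory here, the fallback to the unchanged name is what handles inputs such as "blue_dog".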

Alternatively, you can split the last bit and check if it is a number. A compact (but maybe not the most readable) solution:

QString name = "blue_dog_23";
int i = name.lastIndexOf('_');
bool isInt = false;
QString result = (i >= 0 && (name.mid(i+1).toInt(&isInt) || isInt)) ? name.left(i) : name;
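
A more spelled-out variant of the same split-and-check idea, sketched with std::string instead of QString. The helper name stripNumericSuffixNoRegex is hypothetical. Note one assumption that differs from the one-liner above: this sketch accepts only plain digits after the last underscore, whereas QString::toInt also accepts a leading sign and fails on values that do not fit in an int.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: strips a trailing "_<digits>" suffix without a regex.
std::string stripNumericSuffixNoRegex(const std::string& name)
{
    const std::size_t i = name.rfind('_');
    // No underscore, or nothing after it: there is no numeric suffix.
    if (i == std::string::npos || i + 1 == name.size())
        return name;
    // The suffix counts only if every character after '_' is a digit.
    const bool allDigits = std::all_of(
        name.begin() + i + 1, name.end(),
        [](unsigned char c) { return std::isdigit(c) != 0; });
    return allDigits ? name.substr(0, i) : name;
}
```
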
Tim Hoffmann
0

The following solution should work as you want it to!

^[^\s](?:(?!_\d*\n).)*/gm

In short, this matches everything up to, but not including, _\d*\n. Here, _\d*\n matches the _ character, then zero or more digits (\d*), then a newline (\n). (?!...) is a negative lookahead, and (?:...) is a non-capturing group. Combined, (?:(?!_\d*\n).)* consumes one character at a time for as long as the text at the current position does not begin the _\d*\n suffix, so the match stops just before that suffix.

The leading ^[^\s] anchors the match at the start of a line and requires the first character to be something other than whitespace.

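The /gm suffix sets the global flag (allowing more than one match to be returned) and the multi-line flag (which makes ^ and $ match at the start and end of each line rather than only at the start and end of the whole input). These flags are regex-literal notation, as used in JavaScript, and are not part of the pattern itself.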
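
The lookahead technique can be tried on a single name with std::regex, whose default ECMAScript grammar supports (?!...). This is a sketch under assumptions, not the exact pattern above: it ends the suffix with $ instead of \n, requires at least one digit (\d+), drops the [^\s] prefix and the /gm flags, and wraps the result in a capture group. The helper name stripWithLookahead is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: keeps everything before a trailing "_<digits>" suffix.
std::string stripWithLookahead(const std::string& name)
{
    // (?:(?!_\d+$).)* consumes one character at a time, but only while the
    // text at the current position is NOT "_<digits>" followed by the end.
    static const std::regex rx(R"(^((?:(?!_\d+$).)*))");
    std::smatch m;
    if (std::regex_search(name, m, rx))
        return m[1].str();
    return name;  // defensive fallback; the pattern can match an empty prefix
}
```

An underscore followed by digits in the middle of a name is left alone, because the lookahead only fires when the digits run to the end of the string.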

Arthur-1