
I guess this must have been asked before, but I cannot find anything about it. Maybe the answer is right in front of me and I just can't see it.

So, if QRegularExpression::match() does not find a match, how do I get the position of the character that made the validation fail?

I'm pretty sure that, internally, some variable stores the "current position" while the string is being evaluated.

Backtracking may be involved in that evaluation, so if the exact failing character is hard to get, the position of the last good one might be easier.

Any hints? Thank you.

Edit (2022-08-08):

I'm starting to think that perhaps no one has asked this before after all, given that people assume I am asking something like "why doesn't my regex work?". That is not my case.

This is not about a particular regular expression. It's about Qt's class QRegularExpression.

I apologize if I have not been clear. I have tried to explain it as well as I could from the very beginning.

Anyway, let's say you have a string to be evaluated against some (ANY) regex, and no match is found. I then want to know, if possible, the point where the evaluation failed.

Regex: "abc", string: "abd", failing position: 2

Regex: "abc", string: "acb", failing position: 1

Regex: "abc", string: "xyz", failing position: 0
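
To pin down what "failing position" means in the three cases above: when the pattern is purely literal, it is simply the index of the first character where the string stops agreeing with the pattern. The sketch below uses only standard C++ (no Qt) and a hypothetical helper name, literalFailingPosition, introduced for illustration. It is not a general solution, since a real regex with quantifiers or alternation cannot be compared character by character like this.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Qt: valid for literal patterns only.
// Returns the index in `text` of the first character that differs from
// `pattern`, or the length of the shorter string if one is a prefix of
// the other.
std::size_t literalFailingPosition(const std::string &pattern,
                                   const std::string &text) {
    // Walk both strings in parallel and stop at the first difference
    // or at the end of the shorter string.
    const auto its = std::mismatch(pattern.begin(), pattern.end(),
                                   text.begin(), text.end());
    return static_cast<std::size_t>(its.second - text.begin());
}
```

For the three examples, this gives 2 for "abd", 1 for "acb" and 0 for "xyz".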

I feel very stupid asking this, mostly because I think it is a very basic question.

But it is not what you might assume at first glance. I swear I searched for answers as thoroughly as I could, but everything I found was about errors in the regexes themselves.

Alvein
  • Rather than criticize the downvoter (not me, by the way), you might want to clarify the problem you're trying to solve by at least providing an example regex and search text -- a minimal reproducible example would be nice. – G.M. Aug 07 '22 at 07:49
  • You can test your regex on this site: https://regex101.com/#pcre. There you can see why it doesn't work. You might also take a look at [QRegularExpression::errorString()](https://doc.qt.io/qt-6/qregularexpression.html#errorString) – Tim Aug 08 '22 at 09:51
  • @Tim Gromeyer, I live in regex101.com and doc.qt.io. Thanks. Your comment is a perfect example of people thinking I am asking about "the validity of the regular expression" (just like the docs say). It's not that. Not at all. – Alvein Aug 08 '22 at 15:51

1 Answer


I hate this, but it works.

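#include <QRegularExpression>
#include <QString>

// Returns the length of the longest prefix of sText that the anchored
// pattern matches either completely or partially.
int getFailingPosition(const QString &sRegEx,QString sText) {
    int                     iResult;
    QRegularExpression      rxRegEx;
    QRegularExpressionMatch rxmMatch;
    // Anchor the pattern so that it has to cover the whole text.
    rxRegEx.setPattern(QRegularExpression::anchoredPattern(sRegEx));
    // Shorten the text one character at a time until it is acceptable.
    for(iResult=sText.length();iResult>0;iResult--) {
        rxmMatch=rxRegEx.match(sText);
        if(rxmMatch.hasMatch())
            break; // the remaining prefix matches completely
        else {
            rxmMatch=rxRegEx.match(
                sText,
                0,
                QRegularExpression::MatchType::PartialPreferCompleteMatch
            );
            if(rxmMatch.hasPartialMatch())
                break; // the prefix could still grow into a match
        }
        sText.chop(1); // drop the last character and try again
    }
    return iResult;
}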

Tests:

#define REGEX_USA_ZIPCODE   "\\d{4}?\\d$|^\\d{4}?\\d-\\d{4}"
#define REGEX_SIGNED_NUMBER "[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?"
#define REGEX_ISO8601_DATE  "\\d{4}-(0[1-9]|1[012])-(0[1-9]|[12]\\d|3[0-1])"
#define REGEX_USA_PHONE     "\\(?\\d{1,3}?\\)?[-.\\s]?\\d{1,4}[-.\\s]?\\d{1,4}[-.\\s]?\\d{1,9}"

    qDebug() << getFailingPosition("abc","abcd"); // 3
    qDebug() << getFailingPosition("abc","abd");  // 2
    qDebug() << getFailingPosition("abc","acb");  // 1
    qDebug() << getFailingPosition("abc","xyz");  // 0
    qDebug() << getFailingPosition("abc","x");    // 0
    qDebug() << getFailingPosition("abc","");     // 0
    qDebug() << getFailingPosition("abc","a");    // 1
    qDebug() << getFailingPosition("abc","ab");   // 2
    qDebug() << getFailingPosition(REGEX_USA_ZIPCODE,"12345-1");      // 7 (missing chars)
    qDebug() << getFailingPosition(REGEX_SIGNED_NUMBER,"-0.123e");    // 7 (missing chars)
    qDebug() << getFailingPosition(REGEX_ISO8601_DATE,"2021-23-31");  // 5 (unexpected char)
    qDebug() << getFailingPosition(REGEX_USA_PHONE,"202-3(24)-3000"); // 5 (unexpected char)

Call getFailingPosition() only after confirming that the text does not match. If the text does match, the function returns the full string length, which wrongly suggests that something is missing at the end.

This should have a built-in function...

Alvein