I am trying to match Google URLs in some text that is stored in a variable, using the pattern below.

The URLs are enclosed in double quotes.

QRegExp regExp;
regExp.setPattern("http://www.google.com/(.*)");

I manage to match the URL, but the pattern also matches all of the text that follows it, which I don't want. I have tried variants like the ones below, but they don't seem to work.

regExp.setPattern("http://www.google.com/(.*)\"is"); 
regExp.setPattern("http://www.google.com/^(.*)$\"");

How can I write a regular expression that matches just the URL?

Thanks in advance

user866190

2 Answers

1

Is there a reason you need/want to use a QRegExp?

You could most likely use a QUrl instead.

Eric Hulser
  • Thanks for the suggestion, I'll look into it, because RegExp are quite cryptic to look at – user866190 Sep 05 '12 at 18:22
  • I prefer this solution. To get the portion of the URL that you're matching with that regex, you'd just do something like `QUrl(url_string).path()` – Chris Sep 05 '12 at 19:33
0

We cannot know what surrounds the URLs in your text (quotes? parentheses? whitespace?), but we can write a better regular expression by using a negated character class that excludes characters which cannot be part of the URL. Because the class stops at the first excluded character, the match ends where the URL ends instead of running to the end of the text:

QRegExp regExp;
// Dots are escaped so that they match a literal '.' rather than any character.
regExp.setPattern("http://www\\.google\\.com/([^()\"' ]*)");

Then you just need to add to this negated character class any other characters that can delimit a URL in your text (for example `<`, `>`, or other whitespace characters).
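To see the negated character class at work outside of Qt, here is a minimal sketch that uses the standard library's `std::regex` as a stand-in for `QRegExp`. The helper name `findGoogleUrls` is hypothetical, and the class here excludes all whitespace (`\s`) rather than only the space character, which is an assumption about what delimits the URLs in your text.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every Google URL found in `text`.
// std::regex is used here only as a stand-in for QRegExp.
std::vector<std::string> findGoogleUrls(const std::string& text) {
    // Dots are escaped so they match a literal '.'.
    // The negated class [^()"'\s] stops the match at the first quote,
    // parenthesis, or whitespace character (assumed URL delimiters).
    static const std::regex pattern(R"re(http://www\.google\.com/[^()"'\s]*)re");

    std::vector<std::string> urls;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern); it != end; ++it) {
        urls.push_back(it->str());  // the whole match is the URL itself
    }
    return urls;
}
```

Calling `findGoogleUrls` on text that contains a quoted link such as `"http://www.google.com/search?q=qt"` should return the URL without the closing quote or anything after it. The same pattern idea carries over to `QRegExp`, with the backslashes and quotes escaped for a regular C++ string literal.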

SirDarius