How to get this regular expression

Question

I have this piece of code:

<a href="http://www.fnac.pt/Memorial-do-Convento-Texto-em-Analise-Varios/a242166" class="fontbigger">Memorial do Convento - Texto em Análise</a>

...and I want to get this part:

Memorial do Convento - Texto em Análise

How do I do it? I have tried this:

<a href="[^<]+" class=".+">(.+)</a>

...but the first part, [^<]+, doesn't work because it only recognizes this:

http://www.fnac.pt/Memorial-do-Convento-Texto-em-Analise-Varios/

"Lazy" answer: >.*< and then you just have to remove the < and >. Keeping the '-' between "Convento" and "Texto" is the hard part. In your HTML you could mark this specific dash with a different character (e.g. '~'), then use Java string methods to remove every dash except that one, and finally change the ~ back to -. – Michał M, Jun 27 '16 at 13:11
You might be better off using the `split` function with `/` as the delimiter. Then you could select the 2nd element of the resulting array (index 1, since indexing starts at 0). – Matt Cremeens, Jun 27 '16 at 13:18

score 0 · Answer 1 · answered Jun 27 '16 at 13:42


You can use `captured()` with this regex:

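// Group 1 lazily captures the text between the opening <a ...> tag and the closing </a>.
QRegularExpression regex("<a.*?>(.*?)<\\/\\s*a\\s*>", QRegularExpression::MultilineOption);
QRegularExpressionMatch match = regex.match(resultHTML);
// captured(1) returns a null QString if nothing matched.
QString output = match.captured(1);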
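
For readers without Qt, roughly the same idea can be sketched with the standard library's std::regex. This is a substitute for QRegularExpression, not the code from this answer. The helper name `linkText` is hypothetical, and returning an empty string on no match is an assumption chosen to mirror how `captured(1)` behaves.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text between <a ...> and </a>,
// or an empty string when no anchor element is found.
std::string linkText(const std::string& html) {
    // Same pattern as the Qt snippet; "/" needs no escaping here.
    static const std::regex anchor(R"(<a.*?>(.*?)</\s*a\s*>)");
    std::smatch match;
    if (!std::regex_search(html, match, anchor)) {
        return std::string();
    }
    return match[1].str();  // capture group 1: the link text
}
```

Both quantifiers are lazy (`.*?`), so the first one stops at the first `>` and the second stops at the first `</a>`.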

answered Jun 27 '16 at 13:42

Thomas Ayoub


arhr · Answer 2 · 2016-06-27T14:53:57.813

0

I tried this regex and it worked:

>([^<>]+)<

However, parsing HTML with regex isn't the best option.
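
A minimal sketch of how the capturing group might be read, written with C++ std::regex as an assumption, since the thread does not say which engine the asker must use. The helper name `textBetweenTags` is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first run of text that sits between
// a '>' and a '<', or an empty string when there is none.
std::string textBetweenTags(const std::string& html) {
    static const std::regex between(R"(>([^<>]+)<)");
    std::smatch match;
    if (std::regex_search(html, match, between)) {
        // match[0] is the whole match and still includes '>' and '<';
        // match[1] is the capturing group with only the text.
        return match[1].str();
    }
    return std::string();
}
```

The pattern is not tied to the `<a>` element, so on a full page it returns whatever text follows the first tag, which may be whitespace between two tags.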

edited Jun 27 '16 at 14:53

answered Jun 27 '16 at 13:49

arhr


It worked? It didn't... Yes, I understand that it's not the best option, but it is how the teacher wants it... – user6236820 Jun 27 '16 at 13:56
https://regex101.com/r/iH5wG0/1 Don't forget that the () is a capturing group that you need to access afterwards – arhr Jun 27 '16 at 14:05
Ah okay, and how would you get the link? `http://www.fnac.pt/Memorial-do-Convento-Texto-em-Analise-Varios/a242166` – user6236820 Jun 27 '16 at 15:00
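
One possible way to get the link as well is a second capturing group around the `href` value. The sketch below uses C++ std::regex and assumes that `href` is the first attribute and is wrapped in double quotes, as in the snippet from the question. The `Link` struct and `parseLink` function are hypothetical names. Using `[^"]*` instead of `[^<]+` makes the URL group stop at the closing quote.

```cpp
#include <regex>
#include <string>

// Hypothetical result type: found is false when no anchor matched.
struct Link {
    bool found;
    std::string href;
    std::string text;
};

// Assumes the form <a href="..." ...>text</a> with a double-quoted href.
Link parseLink(const std::string& html) {
    static const std::regex anchor(
        R"rx(<a\s+href="([^"]*)"[^>]*>([^<]*)</a>)rx");
    std::smatch match;
    if (!std::regex_search(html, match, anchor)) {
        return Link{false, std::string(), std::string()};
    }
    // Group 1 is the URL, group 2 is the visible link text.
    return Link{true, match[1].str(), match[2].str()};
}
```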
