I have a simple app that gets all the links from a web page. I'm using libxml2 to parse the HTML and extract the links inside the <a> tags, and Qt's QNetworkAccessManager for the HTTP requests. The problem is how to automatically detect the host name of each link. For example, if I have:
<a href="thelink.html" >
or
<a href="../../../thelink.html" >
or
<a href="../foo/boo/thelink.html" >
I need to convert it to a full absolute URL that includes the host, like this (just an example):
<a href="http://www.myhost.com/thelink.html" >
or
<a href="http://www.myhost.com/foo/boo/thelink.html" >
or
<a href="http://www.myhost.com/m/thelink.html" >
Is there any way to do this programmatically, without doing the string manipulation by hand?
If you know Perl, URI::URL has a comparable method, documented as "Return a relative URL if possible" (http://search.cpan.org/~rse/lcwa-1.0.0/lib/lwp/lib/URI/URL.pm):
$url->rel([$base])
Code example (Qt) that doesn't work, for http://qt.digia.com/support/:
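    QString s("/About-us/");
    QString base("http://qt.digia.com");
    QString urlForReq;
    // Only links that do not already start with "http:" are resolved.
    if (!s.startsWith("http:"))
    {
        QString uu = QUrl(s).toString();
        // Note: baseUrl is not declared in this snippet; "base" above is a
        // QString and is never used to build it.
        QString rurl = baseUrl.resolved(QUrl(s)).toString();
        urlForReq = rurl;
    }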
The resulting urlForReq value is "/About-us/" instead of an absolute URL.