I am using Html Agility Pack to get the info about each product on the page:

My code is below, but the node comes back null. I am using the XPath I copied from Google Chrome.

private void getDataBtn_Click(object sender, EventArgs e)
{
    if (URL != null)
    {
        List<string> Items = new List<string>(50);
        HtmlAgilityPack.HtmlDocument Doc = new HtmlAgilityPack.HtmlDocument();

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(URL);

        // Dispose the response and the reader once the document has been loaded.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            Doc.Load(sr);
        }

        // XPath copied from Google Chrome; SelectSingleNode returns null here.
        var Name = Doc.DocumentNode.SelectSingleNode("/html/body/table[2]/tbody/tr/td[2]/table/tbody/tr[2]/td/table/tbody/tr[2]/td[2]/table/tbody/tr[1]/td[3]/a");
    }
}

What am I doing wrong? Is there another tool that can create XPath expressions compatible with Html Agility Pack?

Filip Roséen - refp
  • I see your point. The problem is that the web page loads some of its content dynamically, so when you fetch the page that dynamic content has not been loaded, which is why you get a null value. Please see my question, where I have the same problem: http://stackoverflow.com/questions/18955793/html-agility-pack-not-loading-the-page-with-full-content – BhavikKama Sep 24 '13 at 09:42

1 Answer

Because there is no such node in the page. When you download it with Html Agility Pack (rather than viewing it in the browser), the page contains this text:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>HobbyKing Page not found.</title>
</head>

<body>
<img src="http://www.hobbyking.com/hk_logo.gif"><br>
<span style="font-family:Verdana, Arial, Helvetica, sans-serif">
<strong>Page no longer available</strong><br>
It seems you have landed on a page that doesnt exist anymore.<br>
Please update your links to point towards the correct hobbyking.com location;<br>
<a href="http://www.hobbyking.com/hobbyking/store/uh_index.asp">http://www.hobbyking.com/hobbyking/store/uh_index.asp</a><br>
<br>
If you continue to see this message, please email <a href="mailto:support@hobbyking.zendesk.com">support@hobbyking.zendesk.com</a></span>
</body>
</html>

The page contains the following sentences:
"Please update your links to point towards the correct hobbyking.com location;
http://www.hobbyking.com/hobbyking/store/uh_index.asp"

P.S. You can confirm this by inspecting the downloaded document in the Visual Studio debugger.
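
If you would rather check this in code than in the debugger, here is a minimal sketch, assuming Html Agility Pack is referenced in the project. The `DownloadCheck` class and its `GetTitle` helper are hypothetical names made up for illustration, and the HTML string is a trimmed copy of the page shown above standing in for the real download. It reads the `<title>` of whatever was actually fetched, which makes it easier to notice when the server returned an error page instead of the product listing.

```csharp
using System;
using HtmlAgilityPack;

static class DownloadCheck
{
    // Hypothetical helper: returns the text of the <title> element,
    // or null when the document has no title.
    public static string GetTitle(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode title = doc.DocumentNode.SelectSingleNode("//title");
        return title == null ? null : title.InnerText.Trim();
    }

    static void Main()
    {
        // Trimmed copy of the downloaded page, used here instead of a live request.
        string html =
            "<html><head><title>HobbyKing Page not found.</title></head>" +
            "<body><strong>Page no longer available</strong></body></html>";

        // If the title is not the one the browser shows, the XPath copied
        // from Chrome was built against a different document.
        Console.WriteLine(GetTitle(html));
    }
}
```

In the original handler you could run the same `//title` lookup on `Doc.DocumentNode` right after `Doc.Load(sr)` and compare the result with the title you see in the browser.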

Chani Poz
  • Well, you should use HttpResponse and Request to get the source code. By the way, after doing this everything still comes back null: /html/body/table[2] –  Jul 19 '12 at 17:28