    <div class="logoDesc">


Gnb Road, Chandmari, Guwahati - 781003




    |
                <a href="http://www.justdial.com/Guwahati/Kiran-Mistanna-Bhandar-&lt;near&gt;-Chandmari/9999PX361-X361-1230284509G9V5B2-DC_R3V3YWhhdGkgQmFjaGVsb3IgQ2FrZQ==_BZDET/map">
                    View Map</a><br>
                <p>
                    <span class="Gray">Call: </span><span style="color: #424242; font-size: 12px;">+(91)-9954843180</span>
                    <span style="color: #424242;">|</span> <a href="http://contest.justdial.com/contest/register.php?utm_source=rsbnr&amp;utm_medium=banner&amp;cont_ref=rsbnr"
                        style="font-size: 12px; display: inline-block;" onclick="_ct('Win Ipad2','ltpg');"
                        target="_blank"><b>Win iPad2</b></a>
                </p>
                <p>
                    <span class="Gray">Also See :</span> <b>Cake Shops</b>, <a href="http://www.justdial.com/Guwahati/Bakeries/ct-10033880">
                        Bakeries</a>, <a href="http://www.justdial.com/Guwahati/Confectionery-Retailers/ct-10127628">
                            Confectionery Retailers</a>
                </p>
            </div>

I am using HTML Agility Pack. I want to extract only the address (the text between the stars, "Gnb Road, Chandmari, Guwahati - 781003"). What should the syntax be? Kindly help.

UPDATE: I am using the following code:

Protected Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim webGet = New HtmlWeb()
    Dim document = webGet.Load("http://www.justdial.com/Guwahati/Bachelor-Cake/ct-10070075")

    Dim nodes1 = document.DocumentNode.SelectNodes("//*[@class='logoDesc']")

    For Each node In nodes1
        MsgBox(node.InnerText)
    Next node
End Sub

With this code snippet I get all the details inside the div, but I just want the address.

user1150440

2 Answers


I'm not familiar with HTML Agility Pack, but here's a straightforward screen scraper:

    string page = Methods.GetPage("http://www.yoururl.com");
    string address = string.Empty;

    // Find the opening "***" marker.
    int firstStars = page.IndexOf("***");
    if (firstStars >= 0)
    {
        // Skip the three stars, then find the closing marker after them.
        int start = firstStars + 3;
        int secondStars = page.IndexOf("***", start);
        if (secondStars >= 0)
            address = page.Substring(start, secondStars - start).Trim();
    }


    // Requires: using System.Net; using System.Text;
    public static string GetPage(string url)
    {
        // Dispose the WebClient once the download has finished.
        using (WebClient webClient = new WebClient())
        {
            try
            {
                byte[] reqHTML = webClient.DownloadData(url);
                return Encoding.UTF8.GetString(reqHTML);
            }
            catch (WebException)
            {
                // The download failed; return an empty page instead of throwing.
                return string.Empty;
            }
        }
    }
FAtBalloon

Try this (adding "/text()" to the end of your XPath):

Protected Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim webGet = New HtmlWeb()
    Dim document = webGet.Load("http://www.justdial.com/Guwahati/Bachelor-Cake/ct-10070075")
    Dim nodes1 = document.DocumentNode.SelectNodes("//*[@class='logoDesc']/text()")
    For Each node In nodes1
        MsgBox(node.InnerText)
    Next node
End Sub
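
One caveat with `/text()`: it matches every text node directly under the div, so whitespace-only nodes and the "|" separator can show up alongside the address. A minimal sketch of one way to filter those out follows. The module name, the `ExtractAddress` helper, and the inline sample markup are hypothetical, and it assumes the address is the first non-blank direct text node and never contains a "|" itself; this has not been checked against the live page.

```vbnet
Imports System
Imports HtmlAgilityPack

Module AddressExtractor
    ' Hypothetical helper: returns the first non-blank text directly under
    ' the logoDesc div, or Nothing if there is none.
    Function ExtractAddress(ByVal html As String) As String
        Dim document As New HtmlDocument()
        document.LoadHtml(html)

        ' text() matches only text nodes that are direct children of the div,
        ' so text inside the nested <a> and <p> elements is not included.
        Dim textNodes As HtmlNodeCollection =
            document.DocumentNode.SelectNodes("//div[@class='logoDesc']/text()")

        ' SelectNodes can return Nothing when the XPath matches no nodes.
        If textNodes Is Nothing Then Return Nothing

        For Each node As HtmlNode In textNodes
            ' Decode entities such as &amp; before cleaning up the text.
            Dim text As String = HtmlEntity.DeEntitize(node.InnerText)

            ' Assumption: the address never contains "|", so keep only the
            ' part before the separator and trim surrounding whitespace.
            text = text.Split("|"c)(0).Trim()

            If text.Length > 0 Then Return text
        Next

        Return Nothing
    End Function

    Sub Main()
        ' Cut-down sample markup modelled on the snippet in the question.
        Dim sample As String =
            "<div class=""logoDesc"">" & vbLf &
            "Gnb Road, Chandmari, Guwahati - 781003" & vbLf &
            "    |" & vbLf &
            "<a href=""#"">View Map</a><br><p>Call: (number)</p></div>"

        Console.WriteLine(ExtractAddress(sample))
    End Sub
End Module
```

If the page contains several `logoDesc` divs, the same loop could be run per div by selecting the divs first and then calling `SelectNodes("text()")` on each one.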
Steven Doggart