-1

Is there a reliable way to collect the links that lead to the detail pages of news items? In other words, after fetching the first page of a website, I want only those links that refer to a news item. Is there a solution?

gre_gor
Ali

2 Answers

0

If this is for one specific website, you could fetch its HTML and extract the links to the news articles with regular expressions. Look for patterns in the HTML (for example a common URL prefix or a CSS class on the news links) that your code can use to identify where the links are.
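A minimal sketch of that idea, assuming the site's news URLs share a recognizable path segment such as `/news/`. The class name `NewsLinkExtractor`, the `pathMarker` parameter, and the `/news/` marker are hypothetical and would have to be adapted to the actual site. A regular expression like this is only a rough heuristic for well-behaved markup, not a full HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NewsLinkExtractor
{
    // Captures the href value of <a> tags that quote the attribute with " or '.
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the distinct absolute URLs whose path contains pathMarker
    // (assumption: news detail pages share a path segment such as "/news/").
    public static List<string> Extract(string html, Uri baseUri, string pathMarker)
    {
        var seen = new HashSet<string>();
        var links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            // Resolve relative links against the page URL; skip unparsable ones.
            Uri absolute;
            if (!Uri.TryCreate(baseUri, m.Groups[1].Value, out absolute))
                continue;
            if (absolute.AbsolutePath.IndexOf(pathMarker, StringComparison.OrdinalIgnoreCase) < 0)
                continue;
            // Keep each URL once, in the order it first appears.
            if (seen.Add(absolute.AbsoluteUri))
                links.Add(absolute.AbsoluteUri);
        }
        return links;
    }
}
```

It could be called as `NewsLinkExtractor.Extract(html, new Uri("http://www.domain.com/"), "/news/")`, where `html` is the downloaded front page.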

I have done this a couple of times to scrape information from a website.

But maybe an obvious question: is there no RSS feed available on the website?
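If the site does publish a feed, the news links can be read from it directly instead of being scraped. The following is a rough sketch, assuming a plain RSS 2.0 feed in which each `<item>` carries a `<link>` element; the class name `RssLinkReader` is hypothetical, and Atom feeds or namespaced elements would need different handling.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RssLinkReader
{
    // Returns the <link> text of every <item> in an RSS 2.0 document.
    public static List<string> ReadItemLinks(string rssXml)
    {
        XDocument doc = XDocument.Parse(rssXml);
        return doc.Descendants("item")
                  .Select(item => (string)item.Element("link")) // null if <link> is missing
                  .Where(link => !string.IsNullOrWhiteSpace(link))
                  .Select(link => link.Trim())
                  .ToList();
    }
}
```

The feed XML can be downloaded the same way as any other page and passed to `ReadItemLinks` as a string.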

Wim Haanstra
0

You can make a simple WebRequest to download the page, then search through the HTML for the content that you want to parse.

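    // Requires: using System.IO; using System.Net;
    WebRequest req = WebRequest.Create("http://www.domain.com/news.html");
    req.Proxy = null; // skip automatic proxy detection
    using (WebResponse res = req.GetResponse())
    using (Stream s = res.GetResponseStream())
    using (StreamReader sr = new StreamReader(s))
    {
        // Save the downloaded HTML to a local file.
        File.WriteAllText("news.html", sr.ReadToEnd());
    }

    // Search through the saved HTML page for the news content here.

    // Open the saved page with its default application to inspect it.
    System.Diagnostics.Process.Start("news.html");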
mbcrump