
If I have a string containing the HTML of a page returned from an HTTP POST, how can I turn it into something that lets me easily traverse the DOM?

I figured an HtmlDocument object would make sense, but it has no public constructor. Are there any types that allow for easy management of an HTML DOM?

Thanks,
Matt

Sky Sanders
Matt

1 Answer


System.Windows.Forms.HtmlDocument wraps a document that has already been loaded by a WebBrowser control, which is why it has no public constructor.

Html Agility Pack is by far the best library I have used for this purpose.

An example from the CodePlex wiki:

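// HtmlDocument here is HtmlAgilityPack.HtmlDocument, not the WinForms type.
HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// FixLink is not defined in this example; supply your own link-rewriting method.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute att = link.Attributes["href"];
    att.Value = FixLink(att);
}
doc.Save("file.htm");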

The example loads a file, but you can also load from a stream with another Load overload, or from a string with LoadHtml.
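For the case in the question, where the HTML is already in a string, something along these lines should work. This is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced; the sample markup and the variable names are made up for illustration and stand in for the real response body.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

class Program
{
    static void Main()
    {
        // Hypothetical stand-in for the HTML returned by the HTTP POST.
        string html = "<html><body><a href=\"/a\">A</a><a href=\"/b\">B</a></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                // GetAttributeValue falls back to the default if the attribute is missing.
                Console.WriteLine(link.GetAttributeValue("href", "") + " -> " + link.InnerText);
            }
        }
    }
}
```

The null check matters in practice: iterating the result of SelectNodes directly throws a NullReferenceException when the page contains no matching nodes.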

carla
Sky Sanders