Questions tagged [html-agility-pack]

HTML Agility Pack is an open-source HTML parser that builds a read/write DOM and supports LINQ, plain XPath, or XSLT.

It is a .NET code library for parsing HTML files as they are found on the web. The parser is very tolerant of malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents or streams.

The easiest way to install HTML Agility Pack is through its NuGet package:

Install-Package HtmlAgilityPack

Latest stable release: 1.11.3 / 18 April 2019

GitHub page: https://github.com/zzzprojects/html-agility-pack

3466 questions

votes

1 answer

XPath Select all children with specific parent node by attribute

I want to select all children, i.e. images, whose parent is the div with id testRoot. The structure is unknown; I have simplified it here for clarity. An XPath expression would be great.

xml xpath html-agility-pack

asked May 24 '17 at 10:48

Idrees Khan

7,702
18
63
111

votes

1 answer

Select only items in a specific DIV using HtmlAgilityPack

I'm trying to use the HtmlAgilityPack to pull all of the links from a page that are contained within a div declared as

However, when I use the code below I simply get ALL links on the entire page. This doesn't really make…

c# html-agility-pack

asked May 20 '10 at 15:38

Adam Haile

30,705
58
191
286

votes

2 answers

Get Links in class with html agility pack

There are a bunch of tr's with the class alt. I want to get all the links (or the first of last), yet I can't figure out how with HTML Agility Pack. I tried variants of a but I only get all the links or none. It doesn't seem to only get the one in the…

c# html-agility-pack

asked May 18 '10 at 13:55

user34537

votes

1 answer

Parsing HTML to get script variable value

I'm trying to find a method of accessing data between tags returned by a server I am making HTTP requests to. The document has multiple tags, but only one of the tags has JavaScript code between it, the rest are included from files. I want to…

c# javascript html-agility-pack

asked Aug 09 '13 at 22:48

James Jeffery

12,093
19
74
108

votes

1 answer

HTMLagilitypack is not removing all html tags How can I solve this efficiently?

I am using following method to strip all html from the string: public static string StripHtmlTags(string html) { if (String.IsNullOrEmpty(html)) return ""; HtmlAgilityPack.HtmlDocument doc = new…

c# string html-agility-pack

asked Jun 01 '13 at 17:53

Obsivus

8,231
13
52
97

votes

4 answers

remove html node from htmldocument :HTMLAgilityPack

In my code, I want to remove the img tag which doesn't have a src value. I am using HTMLAgilitypack's HtmlDocument object. I am finding the img which doesn't have a src value and trying to remove it, but it gives me the error Collection was…

c# collections iteration html-agility-pack dom

asked Aug 24 '12 at 09:05

Priya

1,375
8
21
45

votes

1 answer

HtmlAgilityPack - get all nodes in a document

I would like to traverse all nodes in a document using HtmlAgilityPack. Will foreach (HtmlNode node in myhtml.DocumentNode.SelectNodes("//@")) do?

c# xpath html-agility-pack

asked Feb 02 '12 at 15:22

kiki

votes

2 answers

Extracting Inner text from HTML BODY node with Html Agility Pack

Need a bit of help with HTML Agility Pack! Basically I want to grab the plain text within the body node of the HTML. So far I have tried this in VB.NET and it fails to return the inner text, meaning no change is seen, well at least from what I can…

c# html vb.net html-agility-pack

asked Jul 27 '11 at 22:49

KJSR

1,679
6
28
51

votes

1 answer

HTML Agility pack create new HTMLNode

I'm using HTML Agility Pack to parse and transform an HTML file, but I get an exception "Item has already been added" when I try to create a new HtmlNode because of the index parameter. HtmlNode node1 = new HtmlNode(HtmlNodeType.Element, doc, 0);…

html parsing indexing html-agility-pack

asked Mar 15 '11 at 09:50

Diogo Cardoso

21,637
26
100
138

votes

5 answers

Parsing html with the HTML Agility Pack and Linq

I have the following HTML (..) Test1 Data Data 2 Test2 Data2 Data 2…

c# linq html-parsing html-agility-pack

asked Jan 06 '11 at 15:53

Timo Willemsen

8,717
9
51
82

votes

2 answers

HTML Agility Pack - using XPath to get a single node - Object Reference not set to an instance of an object

This is my first attempt to get an element value using HAP. I'm getting a null object error when I try to use InnerText. The URL I am scraping is: http://www.mypivots.com/dailynotes/symbol/659/-1/e-mini-sp500-june-2013 I am trying to get the…

xpath html-agility-pack

asked Apr 05 '13 at 05:52

dontpanic

votes

3 answers

login to website using HTMLAgilityPack

In the code below, I can set the value of the username and password using HTMLAgilityPack, but I cannot invoke the click event of the login button (the id in the source code of the button is "s1"). Is there any way for this to be done? The reason…

c# authentication html-agility-pack login-script

asked Nov 26 '12 at 16:26

touyets

1,315
6
19
34

votes

3 answers

Get a value of an attribute by XPath and HtmlAgilityPack

I have an HTML document and I parse it with XPath. I want to get the value of the input element, but it didn't work. My HTML: …

c# xpath html-agility-pack

asked Dec 29 '11 at 10:47

Chani Poz

1,413
2
21
46

votes

2 answers

How to strip comments from HTML using Agility Pack without losing DOCTYPE

I am trying to remove unnecessary content from HTML. Specifically I want to remove comments. I found a pretty good solution (Grabbing meta-tags and comments using HTML Agility Pack) however the DOCTYPE is treated as a comment and therefore removed…

html-agility-pack

asked Jul 04 '11 at 05:06

desautelsj

3,587
4
37
55

votes

7 answers

Selecting attribute values with html Agility Pack

I'm trying to retrieve a specific image from an HTML document, using HTML Agility Pack and this XPath: //div[@id='topslot']/a/img/@src As far as I can see, it finds the src attribute, but it returns the img tag. Why is that? I would expect the…

c# .net xpath html-agility-pack

asked Feb 12 '09 at 15:57

Vegar

12,828
16
85
151
