Questions tagged [html-agility-pack]

HTML Agility Pack is an open-source HTML parser that builds a read/write DOM and supports LINQ, plain XPath, or XSLT.

It is a .NET code library for parsing HTML files as they are found on the web. The parser is very tolerant of malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents or streams.

The easiest way to install HTML Agility Pack is through its NuGet package:

Install-Package HtmlAgilityPack
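
With the package installed, the description above (tolerant parsing plus XPath queries over the resulting DOM) can be sketched roughly as follows. This is a minimal illustration, not an official sample: the HapSketch class, the ExtractLinks helper, and the input string are made up for the example, and it assumes a project that references the HtmlAgilityPack NuGet package.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // from the HtmlAgilityPack NuGet package

public static class HapSketch
{
    // Hypothetical helper (not part of the library): collects the href
    // value of every <a> element that carries one.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parse; malformed markup does not throw

        var links = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode a in anchors)
        {
            links.Add(a.GetAttributeValue("href", string.Empty));
        }
        return links;
    }

    public static void Main()
    {
        // Made-up, deliberately sloppy input: unclosed <p> and <a> tags.
        const string html =
            "<p>See <a href=\"https://example.com/a\">first<p><a href='/b'>second</a>";

        foreach (string href in ExtractLinks(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

The same selection could also be written with LINQ over doc.DocumentNode.Descendants("a"); XPath is used here only because it makes the attribute filter compact.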

Latest stable release: 1.11.3 / 18 April 2019

GitHub page: https://github.com/zzzprojects/html-agility-pack

3466 questions
1
vote
1 answer

Adding a Stylesheet with the HAP

I have a stylesheet stored as a string which I'm trying to add to a parsed HtmlDocument using the Html Agility Pack. I can't set the InnerText of a style node because it has no setter. What's the best way of doing this?
Echilon
1
vote
1 answer

Setting HtmlAgilityPack to work with Mono for Android

I've asked a similar question before, but nobody answered. How do I set up HtmlAgilityPack to work with Mono for Android? I've added a reference to the .dll, but when trying to use HtmlDocument I get an error that System.Xml version 2.0.0.0 or 4.0.0.0 is…
fanboy555
1
vote
3 answers

html to c# object, recursive function?

I'm trying to convert an HTML document to a C# object. I have an example list of names in an ordered list, as below. I am using Html Agility Pack.
  1. Heather
  2. Channing
  3. Briana
  4. Amber
Abu Ragneesh
1
vote
1 answer

HtmlAgilityPack - Remove child nodes but retain inner text for the main node

I am trying to get the inner text from a node but it has child nodes and its text is in the middle of its child entries i.e: lalala "script text" The code I need is inside script1, but if I try and get innertext I…
Aaron Gibson
1
vote
2 answers

Get data using HAP (HTML Agility Pack) From Page

A continuation of this post, I am trying to parse out some data from an HTML page. Here is the HTML (there is more info on the page, but this is the important section):
Sugitime
1
vote
1 answer

HtmlAgilityPack Multiple Types of Descendants

I am trying to select divs, spans, labels, etc.: basically any element with a certain attribute. IEnumerable<HtmlNode> allDivsWithItemType = _doc.DocumentNode.Descendants("div").Where(d => d.Attributes.Contains("itemtype")); Is there a way to rope…
Adam
1
vote
1 answer

Html Agility Pack unclosed embed, param tags

I am trying to parse an embed-object tag like this: HtmlNode source2 = HD.CreateElement("source"); source2.InnerHtml =
ngi
1
vote
1 answer

select an element next to current element HtmlAgilityPack

I'm using HtmlAgilityPack to parse an HTML page. I want to select a collection of h3 tags, then loop through it and, for each h3 element, select the element right next to it. Here is my sample HTML:

Somthing here

    list of…
Doan Cuong
1
vote
1 answer

HtmlAgilityPack substring of all by length

I have HTML with nested elements (mostly just div and p elements). I need to return the same HTML, but truncated to a given number of letters. Obviously the letter count should skip the HTML tags and only count letters of InnerText…
Dmitry Efimenko
1
vote
2 answers

Html Agility Pack creating irrelevant characters on save html file in c#

I am working on a project using ASP.NET MVC 3 and C#. I want to change some HTML element attributes, such as width and height, from C#. I have a simple (_Layout.cshtml) file
Shoaib Ijaz
1
vote
2 answers

How to get text between two div tags with some class attribute with HTMLagility

I want to get some text from two HTML divs in an HTML file. After some searching I decided to use the HTML Agility Pack for this. I wrote this code: HtmlDocument doc = new HtmlDocument(); doc.LoadHtml(result); HtmlNode node =…
Saman Gholami
1
vote
0 answers

HtmlAgilityPack used with linq to select nodes

I am trying to use HtmlAgilityPack to parse a piece of HTML and use LINQ to put it into an object for use by other parts of my code. I have the two code snippets below, where #1 uses LINQ and does not work, but #2 uses a for loop and works. Two…
user915383
1
vote
3 answers

WinRT web page parse / DocumentNode.InnerHtml = "URI" rather than page html

I'm trying to create a Metro application with the schedule of subjects for my university. I use HAP + Fizzler to parse the page and get data. The schedule link gives me a "Too many automatic redirections" error. I found out that CookieContainer can help me, but…
1
vote
0 answers

Phrase Html Document for node font size

Is there any way to parse (read) the font size of an HTML node? I am working in C# and tried checking out the HtmlAgilityPack but can't find anything. I am using C# to extract information from a web page and need to know what font/style it has. Thanks for the help.
1
vote
3 answers

Select value from HTML with HTMLAgilityPack

I need some help on how to extract a value from some HTML using the HTML Agility Pack. The (partial) HTML is:
Kevin Appleyard