Questions tagged [html-agility-pack]

HTML Agility Pack is an open-source HTML parser that builds a read/write DOM and supports LINQ, plain XPath, or XSLT.

It is a .NET code library for parsing HTML files as they are found on the web. The parser is very tolerant of malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents or streams.

The easiest way to install HTML Agility Pack is through its NuGet package:

Install-Package HtmlAgilityPack
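
With the package installed, typical usage might look like the minimal sketch below. It loads markup from a string and collects link targets with an XPath query, which shows the System.Xml-like object model in practice. The class name LinkExtractor and the method ExtractLinks are hypothetical names chosen for illustration; HtmlDocument, LoadHtml, DocumentNode, SelectNodes, and GetAttributeValue are the library's own members.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Hypothetical helper: returns the href value of every <a> element in the markup.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant of malformed markup such as unclosed tags

        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // The second argument is the default used if the attribute is missing.
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }

        return links;
    }
}
```

The null check matters in practice: iterating the result of SelectNodes directly throws a NullReferenceException when the query matches nothing, which is a common source of questions under this tag.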

Latest stable release: 1.11.3 / 18 April 2019

GitHub page: https://github.com/zzzprojects/html-agility-pack

3466 questions
1
vote
1 answer

How to search both class and ID names in HTML code using same xpath query?

I am using the code below to search for the class name abc in the HTML: nodes = doc.DocumentNode.SelectNodes("//*[contains(concat(' ', normalize-space(@class), ' '), ' abc ')]"); This gives me the correct result. But if I want to search ID name…
user2025463
1
vote
1 answer

HTMLAgilityPack node name filter doesn't work

I want to get the text of a page using HTMLAgilityPack. I have some code for this: HtmlAgilityPack.HtmlWeb TheWebLoader = new HtmlWeb(); HtmlAgilityPack.HtmlDocument TheDocument = TheWebLoader.Load(textBox1.Text); List TagsToRemove = new…
ahmadali shafiee
1
vote
1 answer

Upload additional files after initial upload

I have an upload script that accepts the .HTML file format. If the HTML text contains image tags, I need to upload those image files to the server from the user's hard drive. I'm trying to think of the best way to approach this. I am using…
user547794
1
vote
3 answers

html agility pack remove children

I'm having difficulty trying to remove a div with a particular ID, and its children, using the HTML Agility Pack. I am sure I'm just missing a config option, but it's Friday and I'm struggling. The simplified HTML runs:
Bob
1
vote
1 answer

How to detect all relative URLs within an HTML webpage?

As the question states: is there some way to detect all relative URLs inside a PHP page? Bear in mind, of course, that the URLs contained in the PHP page may appear in different forms:
Rafik Bari
1
vote
1 answer

HtmlAgilityPack, get a sequence of nodes with a label

Imagine an HTML document similar to this: …
gpupu
1
vote
3 answers

Scraping HTML from Google Translate

I want to translate a string using Google Translator. My sample string is "this is my string". I want to use HTML Agility Pack to parse HTML documents. I tried this: using HtmlAgilityPack; ........ var webGet = new HtmlWeb(); var document =…
user1960072
1
vote
1 answer

extracting all iframe-tags using htmlagilitypack

I'm using HtmlAgilityPack to extract several HTML tags. Here's what I do: HtmlDoc = new HtmlDocument(); StringReader sr = new StringReader(decodedHTML); HtmlDoc.Load(sr); sr.close(); var anchor_tags =…
user1826831
1
vote
2 answers

Html Agility Pack Same html source from multiple pages

Task: I'm supposed to create an application that extracts the name of an item from an Amazon.com webpage. Action: I thought I would use the Html Agility Pack to get this done, and I think I've got a solution going, but there is one bug that keeps…
myselfesteem
1
vote
1 answer

HTML Agility Pack and string parsing?

I have an HTML string like this (Yahoo XML description element):
Current Conditions:
Cloudy, 1 C

Forecast:
Mon - Snow. High: -5 Low: -14
Tue - Light…
AliRıza Adıyahşi
1
vote
1 answer

htmlagilitypack won't parse a table

I am trying to parse a simple data table from the following link: http://www.tase.co.il/TASE/General/Company/companyHistoryData.htm?subDataType=0&companyID=001390&shareID=01100957 You will get the table, clicking the light green submit image on the…
dror a
1
vote
1 answer

Overcoming ambiguity in HTMLAgility pack (Windows Phone 7)

I am trying to get the InnerText of a particular node with the following xpath /html/body/center/table/tbody/tr[5]/td[3]/font/font/span by using the…
Vignesh PT
1
vote
0 answers

Errors converting HTML to XML or HTML with HtmlAgilityPack

My HtmlAgilityPack version is 1.3.0. My example: StringBuilder sbXml = new StringBuilder(); StringWriter sw = new StringWriter(sbXml); XmlTextWriter tw = new XmlTextWriter(sw); HtmlWeb htmlWeb = new…
1
vote
0 answers

HTMLAgilityPack new selection in for each loop

I have this html code
1 2 3
99
100