HTML Parser is a Java HTML parsing library. It features filters, visitors, custom tags, and easy-to-use JavaBeans.
Questions tagged [html-parser]
211 questions
4
votes
2 answers
How to parse HTML table data by table id using JSoup in Java
I need to store my client's table data in a database.
There are n tables for which no table class has been provided (the web page uses only a table id).
Example:
[table width="100%" border="0" cellpadding="0" cellspacing="0" …

identify
- 75
- 1
- 8
4
votes
2 answers
How to add a new div tag to an existing HTML file after an h1 tag using Python
I have an HTML file and I want to add a div tag after the h1 tag. The div tag will contain an anchor tag. How can I edit the existing HTML file using Python and add the div with the link?
This is what I want to do
I tried with…

learner
- 1,041
- 3
- 15
- 31
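One possible approach to the question above, using only the Python standard library, is to re-emit the document while parsing it and append the new markup right after each closing h1 tag. This is a minimal sketch, not the asker's code: the names `DivAfterH1` and `add_div_after_h1` and the sample snippet are hypothetical, and it assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser


class DivAfterH1(HTMLParser):
    """Re-emits the parsed HTML and adds a snippet after every </h1>."""

    def __init__(self, snippet):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.snippet = snippet  # hypothetical markup to insert
        self.out = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")
        if tag == "h1":
            self.out.append(self.snippet)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def add_div_after_h1(html, snippet):
    """Return html with snippet inserted after each closing h1 tag."""
    parser = DivAfterH1(snippet)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

To edit a file, read its contents into a string, pass it through `add_div_after_h1`, and write the result back. End tags are written in lowercase, so the output may differ from the input in tag case.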
3
votes
0 answers
htmlparser2: how to replace a tag with another custom tag that keeps the same attributes
I need to replace one tag with another tag that has the same properties. For example, change this:
asdfasdf
to this:
sometext
With this code I can see the…

J. Alan
- 51
- 2
3
votes
1 answer
How do I sequentially extract English text from different web pages of a Tamil website?
The Naalayira Divya Prabandham is a 4000-verse collection of Hindu poems written in the Tamil language. The website http://dravidaveda.org has a web page for each of the 4000 verses. Each verse page gives the Tamil verse, a word-by-word Tamil…

Keshav Srinivasan
- 131
- 2
3
votes
1 answer
DOM Tree to HTML in node
I am really new to Node. I am working on REST calls. I get a request from Postman (which I use to check REST API calls) with a URL. I need to make a few word-level changes to the contents of that URL,
e.g. if the URL received is…

Divya kukar
- 31
- 3
3
votes
3 answers
Get the href innertext with HtmlAgilityPack
I am trying to create a news agent to get the news from websites, so I have to use an HTML parser like HtmlAgilityPack. Here you can see my code:
public async void parsing(string website)
{
    HttpClient http = new HttpClient();
    var…

Ehsan Akbar
- 6,977
- 19
- 96
- 180
3
votes
2 answers
Python: Extracting specific data with html parser
I started using the HTMLParser in Python to extract data from a website.
I get everything I wanted, except the text within two tags of HTML.
Here is an example of the HTML tag:

IssnKissn
- 81
- 1
- 1
- 6
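For the question above, a common pattern with the standard library `HTMLParser` is to track whether the parser is currently inside the tag of interest and collect data only while it is. The excerpt does not show the actual HTML, so the tag name `span` and the names `TagTextCollector` and `text_inside` below are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collects the text found between <target> and </target>."""

    def __init__(self, target):
        super().__init__()
        self.target = target  # tag name to watch, e.g. "span" (assumed)
        self.depth = 0        # > 0 while inside the target tag
        self.texts = []       # one string per outermost target element

    def handle_starttag(self, tag, attrs):
        if tag == self.target:
            self.depth += 1
            if self.depth == 1:
                self.texts.append("")

    def handle_endtag(self, tag):
        if tag == self.target and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Text of nested tags is included because depth stays above 0.
        if self.depth:
            self.texts[-1] += data


def text_inside(html, target):
    """Return a list with the text content of each target element."""
    parser = TagTextCollector(target)
    parser.feed(html)
    parser.close()
    return parser.texts
```

Calling `text_inside(page_source, "span")` returns one string per matching element; filtering on attributes could be added in `handle_starttag` by inspecting `attrs`.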
2
votes
3 answers
HTML Parser for response - Java
I'm using HttpClient to access a particular website, and the response I get is in the form of HTML. Which parser or method should I use to parse the HTML and get what I want from the response?
Note: I'm using HttpClient with Java.

Geek
- 3,187
- 15
- 70
- 115
2
votes
1 answer
Issue with BeautifulSoup: Some Image URLs Returning as None Despite `src` Attribute Presence
I am using BeautifulSoup to extract image URLs from an HTML structure in Python. The HTML structure contains several img tags with the src attribute. I've implemented the _get_images function, which uses BeautifulSoup's find_all("img") method to…

Elie Hacen
- 372
- 12
2
votes
2 answers
Creating my own html parser
I know this post and have already read it, but I'd still like to learn what language an HTML parser may use. I mean, does it parse the whole source with a regex, or does it use a normal programming language such as C# or Python?
Apart from the…

Shaokan
- 7,438
- 15
- 56
- 80
2
votes
1 answer
How to parse nested table from HTML link using BeautifulSoup in Python?
All,
I am trying to parse a table from this link: http://web1.ncaa.org/stats/StatsSrv/careersearch.
Please note: For searching under "School/Sport Search", select All for School, Year: 2005-2006, Sport: Football, Division I. The column I am trying to…

Data_is_Power
- 765
- 3
- 12
- 30
2
votes
3 answers
python class variable reset?
I am having this issue now: I have a parser class built on the HTMLParser library, like this:
class MyHTMLParser(HTMLParser):
    temp = ''
    def handle_data(self, data):
        MyHTMLParser.temp += data
I need the temp variable because I need to…

Anna
- 443
- 9
- 29
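A class-level attribute like `temp` in the excerpt above is shared by every instance, so text accumulates across parses. One common way to get a fresh value each time, sketched here without knowing the rest of the asker's code, is to make it an instance attribute set in `__init__`. The helper `extract_text` is a hypothetical name added for illustration.

```python
from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):
    def __init__(self):
        super().__init__()
        # Instance attribute: every new parser starts with an empty buffer.
        self.temp = ""

    def handle_data(self, data):
        self.temp += data


def extract_text(html):
    """Parse html with a fresh parser and return the collected text."""
    parser = MyHTMLParser()
    parser.feed(html)
    parser.close()
    return parser.temp
```

Because each call creates a new `MyHTMLParser`, the text from one document does not leak into the next. Reusing a single parser would instead require resetting `temp` by hand between documents.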
2
votes
2 answers
html search and replace preserving html tags
I'm looking for a Java-based HTML parser that can search and replace text while preserving HTML tags. This question has been asked here before, but the answers do not seem to hit the target. There are a few HTML parsers which I downloaded and wrote…

user576249
- 31
- 2
2
votes
0 answers
Crash at parser.py when attempting to output data as a text file .. no idea why
I'm currently trying to port an older Python application to version 3. In the process of doing so, the application now crashes when attempting to output data to a text file:
parser.py, line 110
AttributeError("'HTMLPlusParser' object has no…

artomason
- 3,625
- 5
- 20
- 43
2
votes
1 answer
HTML parser import issues
So I'm trying to make a web crawler in Python using HTMLParser and urllib3. Currently I have two different import problems, the first being
import html.parser
import urllib
urlText = []
#Define HTML Parser
class…

David A
- 41
- 1
- 5