Questions tagged [jtidy]

JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML. In addition, JTidy provides a DOM interface to the document being processed, which effectively lets you use JTidy as a DOM parser for real-world HTML.

JTidy was written by Andy Quick, who later stepped down as maintainer. It is now maintained by a group of volunteers.

Official Website: http://jtidy.sourceforge.net/

97 questions
0
votes
1 answer

How to extract data using JTidy and XPath

I have to extract the company name and face value from http://money.rediff.com/companies/20-microns-ltd/15110088. I noticed that this task could be accomplished using the XPath API. Since this is an HTML page, I am using the JTidy parser. This is the xpath…
Himanshu Soni
  • 264
  • 8
  • 15
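
As a rough illustration of the approach described in this question, here is a minimal sketch of running XPath queries over a DOM with the JDK's `javax.xml.xpath` API. To stay self-contained it parses a small, well-formed XHTML string with the JDK's own `DocumentBuilder`; with JTidy, the `Document` would instead come from the tidied page. The markup, the `id` attribute, the values, and both XPath expressions are hypothetical, since the real page's structure is not shown in the excerpt.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class XPathOverDom {

    // Parses well-formed XHTML into a DOM. This stands in for the JTidy step:
    // with JTidy, the Document would be produced from the cleaned-up page.
    public static Document toDom(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
    }

    // Evaluates an XPath expression against the document and returns the
    // trimmed string value of the first matching node.
    public static String extract(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc).trim();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical markup and values, for illustration only.
        String xhtml = "<html><body>"
                + "<h1 id=\"company\">Example Co</h1>"
                + "<table><tr><td>Face Value</td><td>5.00</td></tr></table>"
                + "</body></html>";
        Document doc = toDom(xhtml);
        System.out.println(extract(doc, "//h1[@id='company']"));
        System.out.println(extract(doc,
                "//td[normalize-space()='Face Value']/following-sibling::td[1]"));
    }
}
```

Anchoring the second query on the label cell's text rather than on a position in the table tends to survive small layout changes, although the right expression depends entirely on the actual page.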
0
votes
0 answers

The package org.xml.sax is accessible from more than one module: - need to keep Maven dependency

I am getting the following error in my project using Eclipse and maven. The package org.xml.sax is accessible from more than one module: I previously had the compiler version set to 1.8, but I want it to be at version 11. I found a bunch…
badperson
  • 1,554
  • 3
  • 19
  • 41
0
votes
1 answer

Connect to website using nodes

I'm trying to write a program that will connect to a website, get the source code, look for the tag using nodes. Within that tag there are three "textfields" that I want to input values in, and stream it back to the website. I got so far to…
Foxticity
  • 11
  • 2
0
votes
1 answer

JTidy output in String instead of Document?

I am trying to convert an HTML string to an XHTML string using JTidy to then parse with XMLWorkerHelper. How do I get the output from Tidy in String instead of Document please? My code is: Tidy tidy = new…
Glyn
  • 1,933
  • 5
  • 37
  • 60
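
One generic way to get a String out of an `org.w3c.dom.Document` is to serialize it with the JDK's `javax.xml.transform` API, sketched below. The sample document is built with the JDK's own parser so the block is self-contained. Whether the DOM produced by JTidy works with every `Transformer` implementation is an assumption that would need checking. Another excerpt on this page passes a `StringReader` and a `StringWriter` to Tidy, which suggests that writing to a `Writer` directly may be the simpler route.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomToString {

    // Serializes a DOM Document to a String using an identity transform.
    public static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        // Leave out the <?xml ...?> declaration so only the markup is returned.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in document; with JTidy this would be the tidied Document.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(
                        "<html><body><p>Hello</p></body></html>")));
        System.out.println(serialize(doc));
    }
}
```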
0
votes
2 answers

How to set the image size while fetching from a web page in Java

Hi, I am fetching an image from a web page using JTidy in Java. This is my code: URL url = new URL("http://www.yahoo.com"); HttpURLConnection conn = (HttpURLConnection) url.openConnection(); InputStream in = conn.getInputStream(); …
DJ31
  • 1,219
  • 3
  • 14
  • 19
0
votes
1 answer

JTidy and XHTML 1.1: is it possible?

I need to transform HTML into XHTML 1.1. I'm doing it in a Java program, so I decided to use JTidy. But if you tell JTidy to transform output in XHTML, you get XHTML 1.0, not XHTML 1.1. I've found some posts on Google about Tidy and XHTML 1.1 from…
robob
  • 1,739
  • 4
  • 26
  • 44
0
votes
2 answers

JTidy preserve CSS rules

Looking for a way to take some html like:
blah blah blah
And run it through…
mtyson
  • 8,196
  • 16
  • 66
  • 106
0
votes
0 answers

Style not being applied when converting HTML to PDF using JTidy and iText in Java

I am converting a document from HTML to PDF using JTidy and Java. The problem is that the style is not applied to the converted PDF. When I tried other solutions (Jsoup, HTMLWorker, XMLWorker), the document was also malformed. …
Ali
  • 1
  • 1
  • 4
0
votes
1 answer

JTidy reports "3 errors were found!"... but does not say what they are

I have a large block of programmatically generated HTML. I ran it through Tidy (version r938) with the following Java code: StringReader inStr = new StringReader(htmlInput); StringWriter outStr = new StringWriter(); Tidy tidy = new…
Paul Brinkley
  • 6,283
  • 3
  • 24
  • 33
0
votes
0 answers

Not showing JTidy parsed data in

I am using JSF 2.2 and RichFaces 4.5.1. In one of the rich:popupPanel I am using h:outputText tag to show HTML data that is parsed by JTidy. Data is the response from one of the web services that we are using. JTidy adds CDATA tag in parsed HTML…
0
votes
1 answer

Android SDK and XQuery?

Is there any implementation of XQuery known to work with the Android SDK? I tried mxquery, but had no luck. I did not expect it to work, as their site says Android support is coming soon. I'm using JTidy to parse web pages into XHTML and am looking…
Hell.Bent
  • 1,667
  • 9
  • 38
  • 73
0
votes
1 answer

JTidy Is Wrapping My Paragraphs

I am using JTidy and Flying Saucer to create PDF documents from HTML. I use JTidy to make sure all the elements are clean and formatted correctly before passing the document into Flying Saucer. I have run into an issue with JTidy that I cannot…
decal
  • 987
  • 2
  • 14
  • 39
0
votes
1 answer

JTidy - How to preserve space between inline elements

My HTML source looks like this:

Hello World

After conversion, the output has lost the space: HelloWorld
Fire Ratz
  • 3
  • 1
0
votes
0 answers

JTidy parsing issue for Chinese content

I am facing an issue with the JTidy parser with the following Chinese content: 所示回报信息以美元表述,并且用如上所示的股份类别进行计算,已扣除所有基金运营费用,e 未扣除销售 费用。 After parsing it returns an extra e after character "e"…
AnilGoud
  • 45
  • 7
0
votes
1 answer

Formatting a snippet of HTML: Jericho, JTidy or Jsoup?

I want to format/indent snippet of HTML String html = "

text1

text2

"; into this

text1

text2

I tried JTidy and Jsoup; however, they adjust my HTML with and/or or . I…
Dmytro Pastovenskyi
  • 5,240
  • 5
  • 37
  • 56