How can I change the HTML content of a tag in Java? For example:
before:
<html>
<head>
</head>
<body>
<div>text<div>**text**</div>text</div>
</body>
</html>
after:
<html>
<head>
</head>
<body>
<div>text<div>**new text**</div>text</div>
</body>
</html>
I tried JTidy, but it doesn't support getTextContent. Is there any other solution?
Thanks. I want to parse HTML that is not well-formed. I tried TagSoup, but when I have this code:
<body>
sometext <div>text</div>
</body>
and I want to change "sometext" to "someAnotherText", calling {bodyNode}.getTextContent() gives me "sometext text". When I call setTextContent("someAnotherText" + {bodyNode}.getTextContent()) and serialize the structure, the result is <body>someAnotherText sometext text</body>, without the <div> tags. This is a problem for me.