5

How can we convert HTML to well-formed XHTML using the Http class API? If possible, please give some demonstration code. Thanks.

javanna
yagnya

3 Answers

13

I just did it using Jsoup, if it works for you:

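import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Parses (possibly malformed) HTML and serializes it again using XML syntax rules.
private String htmlToXhtml(final String html) {
    final Document document = Jsoup.parse(html);
    // Switch the serializer from HTML to XML syntax, e.g. self-closed empty elements
    document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
    return document.html();
}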
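
To check that the string returned by a method like htmlToXhtml really is well-formed, one option is to feed it to the XML parser that ships with the JDK. The following is only a sketch: the class name WellFormedCheck and the method isWellFormed are made up for illustration, and it assumes the JDK's built-in parser, which recognizes the load-external-dtd feature used here. It tests well-formedness only, not validity against an XHTML DTD, and HTML-only named entities such as &nbsp; will be reported as errors because no DTD is loaded.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Jsoup or JTidy.
final class WellFormedCheck {

    private WellFormedCheck() {
    }

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: the JDK's built-in parser. This feature keeps it from
            // fetching an external DTD referenced by a DOCTYPE declaration.
            factory.setFeature(
                    "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input: it is not well-formed
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>line<br /></p>")); // true
        System.out.println(isWellFormed("<p>line<br></p>"));   // false: <br> is never closed
    }
}
```

In a test, calling isWellFormed on the converter's output would catch regressions such as unclosed tags.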

Some useful content where my solution came from:

Community
Vitor Pelizza
4

Have a look at J-Tidy (http://jtidy.sourceforge.net/). It usually does quite a good job of cleaning up messy HTML and converting it to XHTML.

mglauche
  • See examples at http://stackoverflow.com/questions/15063870/jtidy-java-api-toconvert-html-to-xhtml – Vadzim Jul 22 '16 at 17:34
0

You can use the following method to get XHTML from HTML:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.w3c.tidy.Tidy;

// Reads the HTML file at inputFile, writes the XHTML result to outputFile,
// and returns the output path.
public static String getXHTMLFromHTML(String inputFile,
        String outputFile) throws IOException {
    // try-with-resources closes both streams even if parsing fails
    try (InputStream is = new FileInputStream(inputFile);
         OutputStream fos = new FileOutputStream(outputFile)) {
        Tidy tidy = new Tidy();
        tidy.setXHTML(true); // emit XHTML instead of plain HTML
        tidy.parse(is, fos);
    }
    return outputFile;
}
  • If you declare `outputFile` as a parameter, you don't have to `return` it as well. Java passes parameters by value, and for reference types such as `String` that value is the reference to the object (on the heap) created and passed by the caller, so the caller still holds the same reference once the function has ended. (Apart from reinventing the wheel when libraries exist that do this for us; see the other answers.) – Gerold Broser Sep 25 '21 at 12:36