Android decoding html in xml file

Question

In my app I am receiving an XML file that contains some HTML entities such as `&amp;`. I can decode the XML successfully, but not the HTML entities: the strings are cut off wherever they contain an HTML entity. Can anybody help? This is the code I currently use to decode the XML:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    InputStream inputStream = entity.getContent();
    Document dom = builder.parse(inputStream);
    inputStream.close();

    Element racine = dom.getDocumentElement();
    NodeList nodeLst = racine.getElementsByTagName("product");

Does anyone know how I can do the same job, parsing the XML into a DOM object while also decoding the HTML entities?

At the moment my DOM object is incorrect, because it contains strings that were cut off at the HTML entities. What can I do?

Can you expand on what exactly is in the XML file? Is it, for example, `A&B` or `A&amp;B`? And what exactly do you need as the end result, `A&B` or `A&amp;B`? And what do you mean by "cutting"? – RoToRa, Nov 09 '10 at 10:26

score 1 · Answer 1 · answered Nov 09 '10 at 12:29

I have two approaches to suggest:

1. Deactivate validation: `factory.setValidating(false);`
2. Add an XHTML DOCTYPE declaration to your XML stream, immediately after the `<?xml ...?>` declaration:

    <?xml version="1.0"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
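
A minimal sketch of the second suggestion, with one swap stated plainly: instead of referencing the external XHTML DTD, which a parser may try to download, it declares a few entities in an internal DTD subset. The class name `EntityDoctypeSketch`, its method names, and the short entity list are hypothetical and not from the thread. It was written against the desktop JDK parser; Android's bundled parser may handle DOCTYPE declarations differently, so treat it as something to verify on a device.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper: declares HTML entities in an internal DTD subset before parsing. */
final class EntityDoctypeSketch {

    // Assumed entity list; extend it with the entities the feed really contains.
    private static final String ENTITY_DECLARATIONS =
            "<!ENTITY nbsp \"&#160;\"><!ENTITY eacute \"&#233;\">";

    /** Inserts a DOCTYPE with the declarations right after the XML declaration, if there is one. */
    static String withDoctype(String xml, String rootName) {
        String doctype = "<!DOCTYPE " + rootName + " [" + ENTITY_DECLARATIONS + "]>";
        int declarationEnd = xml.startsWith("<?xml") ? xml.indexOf("?>") : -1;
        int insertAt = declarationEnd < 0 ? 0 : declarationEnd + 2;
        return xml.substring(0, insertAt) + doctype + xml.substring(insertAt);
    }

    /** Parses the XML after adding the DOCTYPE; parser errors are rethrown unchecked. */
    static Document parse(String xml, String rootName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // first suggestion; this is also the JAXP default
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(withDoctype(xml, rootName))));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
                + "<products><product>Caf&eacute;&nbsp;&amp; bar</product></products>";
        Document dom = parse(xml, "products");
        System.out.println(dom.getElementsByTagName("product").item(0).getTextContent());
    }
}
```

Reading the text with `getTextContent()` rather than from a single child node keeps the result complete even if the parser splits the text around each entity.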

answered Nov 09 '10 at 12:29

Jean Hominal


Thanks for your answer. I cannot test it right now because we took another approach and changed the way our server sends data to us. I hope this answer helps other people. – Fabien Nov 10 '10 at 13:14
How do I deactivate validation when I call `getResources().getXml(R.xml.laws)`? – Daniel Ryan Sep 06 '11 at 23:56
@Zammbi: I believe you should be able to deactivate validation by using the [`XmlPullParser`](http://developer.android.com/reference/org/xmlpull/v1/XmlPullParser.html) interface and the `setFeature` method. I would suggest asking a new question if you need more information. – Jean Hominal Sep 12 '11 at 08:56

score 1 · Answer 2 · edited Nov 08 '11 at 15:43

I think it is because the parser treats the apostrophe (') as the end of the string. I found a solution:

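    // Read the whole response into one string, then turn the encoded apostrophes into literal ones.
    String stringDatosEntrada = new Scanner(urlConnection.getInputStream()).useDelimiter("\\A").next()
            .replaceAll("&amp;#39;", "'").replaceAll("&#39;", "'");

    // Parse the cleaned string instead of the raw connection stream.
    InputStream is = new ByteArrayInputStream(stringDatosEntrada.getBytes());
    Document dom = builder.parse(is);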
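
The same pre-processing idea can be stretched to named HTML entities that XML does not define. The sketch below is only an illustration under assumptions: the class `EntityPreprocessSketch`, its methods, and the three-entry entity table are hypothetical, and a real feed would need a fuller table. It rewrites known names to numeric character references, leaves XML's own entities such as `&amp;` untouched, and undoes the double-encoded `&amp;#39;` mentioned in the answer.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper: rewrites HTML-only named entities before the XML parser sees them. */
final class EntityPreprocessSketch {

    // Assumed, deliberately tiny table of entity name -> Unicode code point.
    private static final Map<String, Integer> HTML_ENTITIES = new HashMap<>();
    static {
        HTML_ENTITIES.put("nbsp", 160);
        HTML_ENTITIES.put("eacute", 233);
        HTML_ENTITIES.put("euro", 8364);
    }

    private static final Pattern NAMED_ENTITY = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    /** Replaces known named entities with numeric references; unknown names are left as they are. */
    static String toNumericReferences(String xml) {
        // Undo the double encoding first, so the parser later reads a plain apostrophe.
        Matcher matcher = NAMED_ENTITY.matcher(xml.replace("&amp;#39;", "&#39;"));
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            Integer codePoint = HTML_ENTITIES.get(matcher.group(1));
            String replacement = codePoint == null ? matcher.group() : "&#" + codePoint + ";";
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    /** Parses the cleaned XML and returns the text of the first <product> element. */
    static String productText(String rawXml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(toNumericReferences(rawXml))));
            return dom.getElementsByTagName("product").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(productText(
                "<products><product>Rock &amp;#39;n&#39; roll &amp; caf&eacute;</product></products>"));
    }
}
```

A plain text substitution like this also rewrites matches inside CDATA sections, so it suits feeds that do not use them.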

edited Nov 08 '11 at 15:43

Yi Jiang


answered Nov 08 '11 at 13:40

Charly Baquero


score 0 · Answer 3 · edited Feb 08 '17 at 14:30

You could try Android's `Html` class. It should do what you want: it doesn't recognise all HTML, but it does seem to work for converting strings:

    Html.fromHtml(htmlString)

Here is a simple example:

    TextView tv = (TextView) findViewById(R.id.tv);
    String s = "<b>This is</b> my first <u>HTML String</u> &amp; it works well!";
    tv.setText(Html.fromHtml(s));

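The `TextView` then shows "This is my first HTML String & it works well!", with "This is" in bold and "HTML String" underlined.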

edited Feb 08 '17 at 14:30

Community


answered Nov 09 '10 at 09:41

Scoobler


I know about this function, thanks. But it cannot help, because my DOM object is already invalid (the strings inside are cut off). It is too late to use this function. I need another way to parse the XML file, one that accepts HTML entities and does not cut the strings. – Fabien Nov 09 '10 at 09:46
Possibly this site might be more help: [Using XPATH and HTML Cleaner to parse HTML / XML](http://thinkandroid.wordpress.com/2010/01/05/using-xpath-and-html-cleaner-to-parse-html-xml)? – Scoobler Nov 09 '10 at 09:50
See this very similar post, where the user used `XmlPullParser`: [Parsing html numbers in xml](http://stackoverflow.com/questions/4132092/parsing-html-numbers-like-189-in-dom-parser-android/4132536#4132536). Maybe it will help? – Scoobler Nov 09 '10 at 18:06
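
The "cut" strings described in this thread would be consistent with a DOM in which the text around an entity is split over several text nodes, so that reading only the first child returns a fragment. Nothing in the thread confirms that this is the cause. Under that assumption, the sketch below builds such a split element by hand (the class `SplitTextSketch` and its methods are hypothetical) and compares reading the first child with `getTextContent()` and with `normalize()`.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical demo: text split across sibling text nodes, as may happen around entities. */
final class SplitTextSketch {

    /** Builds a <product> element whose text is stored in three separate text nodes. */
    static Element splitProduct() {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element product = dom.createElement("product");
            product.appendChild(dom.createTextNode("Fish "));
            product.appendChild(dom.createTextNode("&"));
            product.appendChild(dom.createTextNode(" chips"));
            dom.appendChild(product);
            return product;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads only the first child node, which is where a truncated string would come from. */
    static String firstChildOnly(Element element) {
        return element.getFirstChild().getNodeValue();
    }

    /** Reads the concatenated text of all descendant text nodes. */
    static String fullText(Element element) {
        return element.getTextContent();
    }

    public static void main(String[] args) {
        Element product = splitProduct();
        System.out.println(firstChildOnly(product)); // only the first fragment
        System.out.println(fullText(product));       // the whole string
        product.normalize();                          // merges adjacent text nodes
        System.out.println(firstChildOnly(product)); // now the whole string as well
    }
}
```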
