1

I am parsing XML using a SAX parser on Android. My XML structure is given below:

   <customerlist>
        <Customer>
              <customerId>2</customerId>
              <customerFname>prabhu</customerFname>
              <customerLname>kumar</customerLname>

              <customerImage>
                       iVBORw0KGgoAAAANSUhEUgAAADAAAAAwCAYAAABXAvmHAAAABHNCSVQICAgIfAhkiAAACJ9JREFUaIHtmllsXFcZx3/n3Due8XgcO3a8xG72tGkTknRJG7qAaEtbhFjVJUpBoBahgkoRUCFe+lCJvkAKCGgFPJQK0VKWqIhShMhL0mahLU1ImjZp0ziZLN5dezxje5Z7z/l4uDOOl9mcOskLf+vTle79fM63nPM/53xn4P+4tFDzrDffkEoKpQxTgAZcoCYvzpT3FxKWwHAD5PLiT3k/DW6JRjQQAupuuadx/VWbYk9b5dcg+uJkQlnR4uaOvjn28J7ticPAOOARODVdtUQTYaDu+7/86GONLfqbvvUiF9DcknB1KJMYtL/a9shrTxA4kWNGFkploAaoa1+t7mxfWhcRqTgULwiUUpG+02N3Aj8jyEBupk4xBxRBBmLJVEqiEznEWgBEBI8MYVVbdHYpFEZ8tNIUT65CrAEFSjmAxZcsIRUp3p7WJFNZAWJAOi/+VJ1SE9IBQmKsgwGsAgth3cCVTZ9FRIFViBXESjC9rCKTG+MjLfcRdRZjJ7+d0/P8LEsab6YxshxjfMK6gU9f/hS+tfk+ZogBMdYhmI8FEpmGcpO4xs8pba1grQCWzcsfYVH0ckbHexkcP8K1nV+jNbaWfad+zliml8+seYpoaCEnh3ZzZfMXuKzhBg50P8v69nuJ1SzmpSPfwPiGnJ+jM7aZtW1fpCW6FusJ6Nk50Br8nNIEQ7posEtlQEGet0QQEayxdMQ2se/U06xsvh3P5NjQvpUzI6+zsul2Wus3MjT+PuO5IXybZVnjzYxMxLFiibhNdCf307HgBqI1bcTcdm5c+m26R/fjKBfBTvYzU+wMm+bkQDaXYiKTJJ1JIlaTM0lWNt7GFc2fIuZ2MJB6l8M9L+JZD7GKkx/sYTDVRSY7Qcab4N8nf01qYpDB1HFOD7+OlhqyuVE8P4OIomvwFTw/O9nHTJnIJMnmUlNtqnoIAbChfQurlrVjjUcs0kEy3cuZ0TdYuehWbln5XerDHVy/5AHaF1xNrdPA+sX3oHFRQHPdajYv/TqezXJ151Y2dNzLW90vsDC6irHcAErB1mtfIKQj3Ljs4fyknhFdreia6AO2lbSxOFVAB7B4//7/Prdx4/o1xsxaPy4KHMfh0KHD71133TVfBnqBQWZQaYVtgS3/+aKgvA1lh5CIYG0wwS4Fqum7rANwjoUuBarpt6wDxhistVh76YZSpflXMQNQXSQuFSo6MJ8ZUCqgcpHq2gv0y6MsC8135K21ZPzROf1PJRsqnq4KGZgP8UyavfEfEeyvyusqpfA8j1gs1lBfXx8qZd9FYyGlVDAklINCTbYpMzbSBb2xsTF6enro7e11CQJddDyVdcDzvEkm+jBwdASXEFkzhuenGcmcIqwXUBdqQQDfpid1tdb09PTQ399PLBYD8H3fh/PZC31YKDQoxenRvZxK7KQ79QYT3iBnj76GVmE6Y5tYsfBWVjTeBiiEYPgMDw+jdTC6RUSLSCEDc3PAGDO5GpeHENJhjPWx+eKBq2s53P8HDvb/jpxJnXMIhZGg2BBPvMqpxKvs1j/m+sUPcVXr3QiGbDaL67po5VQsrMxLicRVYU6Pvs64N4RWLq6KcHp0L/vOPEnWT+a70QggYvnY0sf4xLLHQRSCxjMT7DmzjUQ6jkITCoXIeR4D40cZzZ4t60LFvZAxpuJqaMjQFt3Ie8Mv0538D2GnnmMf/A1HR2EWCSjiiVfQSiOBRygctGj+dHgLd699nivWrObZXQ8yMjSGzjSXXQzmiYUEUKxv2UJbbCMvHrkfV0colf/4yE5AcHRtEADxcHSEj1/2KCKGZKYbZ1GcsFXE3z47lslk/HxjVRe2gHNrQDU0Kgi+8Xl34K84EsEzaRQurqqZRZWaoMzkmzSOruWOFU8iWPbEn2DCJNiybjvt4evRzYdo6
bRTq3KzDKm4Ele7BigU49l+TgzvxEiOdS1baa1bh7FZRJghFovhsvqbuXPVTzmZ2MWO44+SNuNoQhzqf57rFj+Eby3KmTS8qCFV0Wg164BCEx/dTSrTS0hHyPnjGONN61aweGaCNYs+x7qWeznY91v+dex7CIImPKn7Tt+fqXWaaIqspts/CGVONRVZqFS1YFb1QIRUti8oWukQx4b+zgcTx0DVBPyOwVERNrR9hVUL72Lv6SeJJ3ZPDi+Z8uc6Ed4Z3M6G9vtR0wM/tzlQWAOqywBY6wexUoKTj6iIh+NEuabtAcJuI28NPMc7A39BowKdok1rcjZFOpvgjhXb+AX3lex33krlglAXaseK5AesYMSjvqaTu1b9BKVcdp18fJLrJR/bUgKaQ/3PYWRaJXFuK7Hv+3M4EwtLGm6irqaF8dwgrdH13LLsB5wc3sk/3v0WILiqNp+VakKiSOdG2Bn/YVmted2N1jmtLG/4JNFQE50LNrMz/jgj6RPU6DrO75JHIbOvBKZh3uaAo2tAFJs7v8OLR7/E/t5ncHAIqeh5b8eV5AvEZTAvu1GlHPpShzibfI0lDTfSWreOROZEsJW4wKiYganPYtBoDvT8hsMDf8TVtRwZ2s7nr3iGjDdCd/JNlDp/nhCBCgkoz0KFCVye/2Eo/T5KhYLmxOFg3++5uu1BLFJkFZ6DWCoWB0s5MOl3ZQcMa1u25HsEJYoTwzs4PrKDluhVGJMtzZXVyjmbqt4LCQTF1cIZtbTAkoabWNl0B6IEFLhOLceH/8nGtq+indC5s9QcRcFU8prTXsgAmQMHDox0dXUlfd/XgBKRElyoEFnO290vTWtirPllrGyma3gHag40KiBoRCtsZlxGgAxFrliDnou/a8rLJmA50AhECO6qLgY8AqMTQBx4ExgGkvlv04wthhhQB7QCC4B6gpvL2bcQFwYGyAIpAqMHCO6Jx6nynrgQgWGCq80kgfEX+mcGBVgCJwpXqxlmRL6AUhnQBAa7U54X43cSBRROYD6BI4XnLFL9H0iaJNCEw0eHAAAAAElFTkSuQmCC

        </customerImage>


        </Customer>

</customerlist>

I am able to get customerId, customerFname and customerLname, but for customerImage I do not get the complete string. I only get part of it, i.e. (iVBORw0KGgoAAAANSUhEUgAAADAAAAAwCAYAAABXAvmHAAAABHNCSVQICAgIfAhkiAAACJ9JREFUaIHtmllsXFcZx3/n3Due8XgcO3a8xG72tGkTknRJG7qAaEtbhFjVJUpBoBahgkoRUCFe+lCJvkAKCGgFPJQK0VKWqIhShMhL0mahLU1ImjZp0ziZLN5dezxje5Z7z/l4uDOOl9mcOskLf+vTle79fM63nPM/53xn4P+4tFDzrDffkEoKpQxTgAZcoCYvzpT3FxKWwHAD5PLiT3k/DW6JRjQQAupuuadx/VWbYk9b5dcg+uJkQlnR4uaOvjn28J7ticPAOOARODVdtUQTYaDu+7/86GONLfqbvvUiF9DcknB1KJMYtL/a9shrTxA4kWNGFkploAaoa1+t7mxfWhcRqTgULwiUUpG+02N3Aj8jyEBupk4xBxRBBmLJVEqiEznEWgBEBI8MYVVbdHYpFEZ8tNIUT65CrAEFSjmAxZcsIRUp3p7WJFNZAWJAOi/+VJ1SE9IBQmKsgwGsAgth3cCVTZ9FRIFViBXESjC9rCKTG+MjLfcRdRZjJ7+d0/P8LEsab6YxshxjfMK6gU9f/hS+tfk+ZogBMdYhmI8FEpmGcpO4xs8pba1grQCWzcsfYVH0ckbHexkcP8K1nV+jNbaWfad+zliml8+seYpoaCEnh3ZzZfMXuKzhBg50P8v69nuJ1SzmpSPfwPiGnJ+jM7aZtW1fpCW6FusJ6Nk50Br8nNIEQ7posEtlQEGet0QQEayxdMQ2se/U06xsvh3P5NjQvpUzI6+zsul2Wus3MjT+PuO5IXybZVnjzYxMxLFiibhNdCf307HgBqI1bcTcdm5c+m26R/fjKBfBTvYzU+wMm+bkQDaXYiKTJJ1JIlaTM0lWNt7GFc2fIuZ2MJB6l8M9L+JZD7GKkx/sYTDVRSY7Qcab4N8nf01qYpDB1HFOD7+OlhqyuVE8P4OIomvwFTw/O9nHTJnIJMnmUlNtqnoIAbChfQurlrVj)

My XML handler code is below:

import java.util.ArrayList;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import android.util.Log;

import com.bvbi.invoicing.client.android.customer.model.CustomerPojoInList;

public class CustomerListParser extends DefaultHandler {

    Boolean currentElement = false;
    String tempValue = null;


    CustomerPojoInList customer = null;
    public static ArrayList<CustomerPojoInList> customers = null;

    @Override
    public void startDocument() throws SAXException {
         customers = new ArrayList<CustomerPojoInList>();

    }

    /** Called when tag starts ( ex:- <name>AndroidPeople</name> 
     * -- <name> )*/
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {

        currentElement = true;

        if (localName.equals("Customer"))
        {
            /** Start */ 
            customer = new CustomerPojoInList();

        }


    }

    /** Called when tag closing */
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {

        currentElement = false;
        String currentValue = tempValue;
        tempValue = "";
        /** set value */ 

        if (localName.equalsIgnoreCase("customerId"))
            customer.setCustomerId(currentValue.toString());

        else if (localName.equalsIgnoreCase("customerFname"))
            customer.setCustomerFname(currentValue.toString());

        else if (localName.equalsIgnoreCase("customerLname"))
            customer.setCustomerLname(currentValue.toString());
        else if (localName.equalsIgnoreCase("customerImage"))
        {
            Log.d("prabhu","Customer image in parser......"+currentValue);
            customer.setCustomerImage(currentValue.toString());
        }
        else if (localName.equalsIgnoreCase("Customer"))
            customers.add(customer);

    }

    /** Called to get tag characters */
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {

        if (currentElement) {
            tempValue = new String(ch,start, length);

            if(tempValue.equals(null))
                tempValue = "";
            currentElement = false;

        }

    }
      @Override
      public void endDocument() throws SAXException {


        }

}

Please help me to fix the issue.

Prabhu M
  • First off, this is actually a really bad use of a SAX parser. You are parsing the entire XML and using POJOs to create objects out of it. Why not just use a DOM parser? It saves something like 90% of the lines of code and a million bugs. I say that because I am plagued by developers using SAX this way. It's a misconception that DOM uses more memory or is slow. Use DOM. – Siddharth Feb 02 '12 at 08:03
  • @Siddharth: Okay, it is PHP, but have a look at http://p0l0.binware.org/index.php/2011/07/04/simplexml-vs-xmlwriter-vs-dom - DOM is slow and uses a lot of memory. I haven't done it on Android so far, but I am pretty sure there is a StAX implementation around. Maybe even Woodstox? That would be my pick for doing it in Java. Don't use DOM! Since you are on Android (a mobile phone?), memory and performance really matter! – Max Feb 02 '12 at 08:15
  • I have used DOM on Android, and it's really cool. Nothing to worry about. It's a misconception that Android devices are slow. Currently Android runs a 1GHz processor with 2Gb memory. Seriously, don't worry about it. It's faster and the code is so much less (90%). No POJOs, no set methods, no get methods, no spelling mistakes, no if-then-else for every tag. Every time the webservice changes you need to test the entire flow. With DOM, you only write a get for those elements you use/need. – Siddharth Feb 02 '12 at 08:19
  • It's not just this one issue; you need to do a lot more to extract attributes from the XML. The code gets crazier for every new XML, every new tag, every new attribute and every new "type" of value. You are better off with DOM and a few gets. Traversing the tree every time for every element won't be as slow as you think. – Siddharth Feb 02 '12 at 08:20
  • @Siddharth: I can't see any getters or POJOs for StAX parsing. Either way, cursor or event iteration, you can just put ifs for the desired tag name or attribute and you are there. So even if "the webservice changes", you just modify the clause in there. Also, less code does not necessarily perform better. Anyway... this is an interesting discussion, but unfortunately it is not solving Prabhu's problem. – Max Feb 02 '12 at 09:00
  • @Siddharth: Hi, have a look at this: http://www.developer.com/ws/article.php/3824221/Android-XML-Parser-Performance.htm – Prabhu M Feb 02 '12 at 09:10
  • customer = new CustomerPojoInList(); is the POJO. It's a known fact that webservice change control is a huge issue. Versioning is rarely done. My experience is from practically falling into these pits multiple times. For Prabhu's scenario DOM will take care of all his buffering, attribute and future issues. The solution to all of Prabhu's issues, today's and tomorrow's, is DOM. Move to DOM, get things working, and then if performance is an issue think of SAX. I bet he won't look at SAX later. – Siddharth Feb 02 '12 at 09:16
  • @Prabhu: The test runs 10,000. Your app won't run more than 10 at a time. Please don't get misled by benchmark numbers. They are important, but not as important when it comes to simpler applications. If your application does parse 10,000 records one after another, then I rest my case. If not, I still think you should move to DOM. – Siddharth Feb 02 '12 at 09:21
  • @Siddharth: again, I'm talking of StAX, not of SAX. Please note the _t_ in there. There are no POJOs! When parsing a stream, I'm not interested in objects, I'm interested in events that I can handle any way I wish (even though I could use a POJO if I so wished, I obviously don't). Reading Prabhu's XML, I also very much doubt that this XML will change a lot. That's customer data, so I really think Prabhu's XML is produced in-house and, even though this might sound sci-fi, that people in this company talk to each other before they change it. Maybe he even produces the XML himself. – Max Feb 02 '12 at 09:53
  • @Max: StAX: I rest my case. Yes, if you are interested in events, it is the right thing to use. And I agree that for very simple XML, SAX is a good option. – Siddharth Feb 02 '12 at 10:23
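
For comparison, here is roughly what the DOM route argued for in the comments could look like. It is a minimal sketch for plain Java, not code from the question: the class DomCustomerReader and the method textOfFirst are hypothetical names, and it assumes the document is small enough to hold in memory. getTextContent() returns the whole text of the element, so nothing is cut at a chunk boundary.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class DomCustomerReader {

    // Returns the trimmed text of the first element named tagName,
    // or null if the document contains no such element.
    static String textOfFirst(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            return nodes.getLength() == 0
                    ? null
                    : nodes.item(0).getTextContent().trim();
        } catch (Exception e) {
            // Sketch-level error handling: surface any parser problem unchecked.
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Calling DomCustomerReader.textOfFirst(xml, "customerImage") would return the full Base64 string. The price is that the whole tree is built first, which is the memory concern raised above.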

3 Answers

1

A SAX parser is allowed to deliver the text of one element in several calls to characters(); in my case each call carried at most 1024 characters. So we need to append the chunks until the element's end tag is reached.

I changed the above code as follows

public void characters(char[] ch, int start, int length)
        throws SAXException
{
    Log.d("prabhu", "Customer image length in parser......" + length);

    if (currentElement) {
        // First chunk after a start tag: begin a fresh value.
        tempValue = new String(ch, start, length);
        currentElement = false;
    } else {
        // Any further chunk of the same element: append to what we already have.
        tempValue = tempValue + new String(ch, start, length);
    }
}
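
The same idea can be written with a StringBuilder that is cleared on every start tag and read on every end tag. The following is only a minimal sketch for plain Java, not the handler from the question: TextCollectingHandler and its parse helper are made-up names, and it keys the values by qName because the default SAXParserFactory is not namespace-aware.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, for illustration only.
class TextCollectingHandler extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    final Map<String, String> values = new HashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        text.setLength(0); // forget whitespace seen before this start tag
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append, never assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Everything appended since the last tag belongs to this element.
        values.put(qName, text.toString().trim());
        text.setLength(0);
    }

    // Parses xml and returns element name -> trimmed text.
    static Map<String, String> parse(String xml) {
        try {
            TextCollectingHandler handler = new TextCollectingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            // Sketch-level error handling: surface any parser problem unchecked.
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Container elements such as Customer end up with an empty string in the map, since only whitespace follows their last child. A real handler would dispatch on the element name in endElement instead of storing everything.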
Prabhu M
  • Hi Prabhu, glad it works now. One more question, as I would also like to know: does Log.d() cause the 1024 limitation, or is it the parser? If it is the parser, then you might again consider using a different method for your parsing... okay, it works, but I don't think that's a nice solution. – Max Feb 03 '12 at 09:50
  • @Max It's the parser. The SAX implementation itself works like this for better performance. – Prabhu M Feb 03 '12 at 10:19
  • I see. Anyway, I would recommend you use another parser. Maybe it is a bit more work now, but you'll be happy afterwards. – Max Feb 03 '12 at 10:25
  • Thanks, but performance matters more than the extra work. I think using SAX is better for performance. I think this http://www.developer.com/ws/article.php/3824221/Android-XML-Parser-Performance.htm might be useful. – Prabhu M Feb 03 '12 at 12:11
  • StAX (note the "t" in there) would perform much better and work better as well. – Max Feb 03 '12 at 12:14
  • Thanks Max, I will look into StAX. – Prabhu M Feb 03 '12 at 12:16
  • Where do you set the **boolean currentElement** value back to false in the above function? – LeDerp Jul 09 '15 at 15:38
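
For readers curious about the StAX suggestion, a cursor-style read could look roughly like the sketch below. It is written against javax.xml.stream from the desktop JDK, which as far as I know is not part of the Android SDK (there, XmlPullParser with nextText() fills a similar role). StaxCustomerReader and readElementText are hypothetical names. getElementText() gathers all text up to the matching end tag, so no manual appending is needed.

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only (desktop JDK, not Android).
class StaxCustomerReader {

    // Returns the trimmed text of the first element named elementName,
    // or null if the stream ends without finding one.
    static String readElementText(String xml, String elementName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(reader.getLocalName())) {
                        // Reads everything up to the matching end tag.
                        return reader.getElementText().trim();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Note that getElementText() only works for elements that contain text and no child elements, which holds for customerImage in the question's XML.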
0

First-time poster. I have updated this answer with something that might help others; I hope it is not too specific to my particular problem. I am parsing an RSS feed that I create myself. It has a really long description, but the other tags you are interested in, i.e. feed title, date and URL, are always short. The description contains information about social events. Within the description, I use tags that I later parse to give me information about the event, such as the event date (different from the RSS pubDate), (Location), (ticketDetails), (Phone), etc. You get the idea.

A good way to handle this is with a slight modification of the answer in this post. I added tags to the description for (Event) and (EndEvent), and I keep appending to my StringBuilder until I get "(EndEvent)". That way I know I have the full string. It might not work for your situation if you don't control the feed, unless you know there is always a certain string at the end of your RSS description.

Posting in case this (cough, hack) helps anyone. The code is as follows:

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes)
        throws SAXException {

    // Start a fresh buffer for the text of every element.
    strBuilder = new StringBuilder();

    if ("item".equals(qName)) {
        currentItem = new RssItem();
    } else if ("title".equals(qName)) {
        parsingTitle = true;
    } else if ("link".equals(qName)) {
        parsingLink = true;
    } else if ("pubDate".equals(qName)) {
        parsingDate = true;
    } else if ("description".equals(qName)) {
        strBuilder = new StringBuilder(); // reset strBuilder to an empty buffer
        parsingDescription = true;
    }
}


@Override
public void endElement(String uri, String localName, String qName) throws SAXException {

    String descriptionTester = strBuilder.toString();

    if ("item".equals(qName)) {
        rssItems.add(currentItem);
        currentItem = null;
    } else if ("title".equals(qName)) {
        parsingTitle = false;
    } else if ("link".equals(qName)) {
        parsingLink = false;
    } else if ("pubDate".equals(qName)) {
        parsingDate = false;
    }
    //else
    //  currentItem.setDescription(descriptionTester);
    else if ("description".equals(qName) && descriptionTester.contains("(EndEvent)")) {
        parsingDescription = false;
    }
}


@Override
public void characters(char[] ch, int start, int length) throws SAXException {

    // Accumulate every chunk; a long description arrives in several calls.
    if (strBuilder != null) {
        strBuilder.append(ch, start, length);
    }

    if (parsingTitle) {
        if (currentItem != null) {
            currentItem.setTitle(new String(ch, start, length));
        }
        parsingTitle = false;
    } else if (parsingLink) {
        if (currentItem != null) {
            currentItem.setLink(new String(ch, start, length));
            parsingLink = false;
        }
    } else if (parsingDate) {
        if (currentItem != null) {
            currentItem.setDate(new String(ch, start, length));
            parsingDate = false;
        }
    } else if (parsingDescription) {
        // Only store the description once the end marker has been seen.
        if (currentItem != null && strBuilder.toString().contains("(EndEvent)")) {
            String descriptionTester = strBuilder.toString();
            currentItem.setDescription(descriptionTester);
            parsingDescription = false;
        }
    }
}

As I said, hope that helps someone as I was stumped on this for a while!

0

The output you posted is exactly 1024 characters. That looks like a buffer size. How do you get this output? Maybe check that method and/or your CustomerPojoInList.

I very much believe that there is some buffer involved that holds a maximum of 1024 characters...

Good luck!

Max
  • I am getting this output during parsing itself. – Prabhu M Feb 02 '12 at 09:37
  • What line produces this output? Maybe also output the length of the string, then you should see whether the string already is truncated or not. – Max Feb 02 '12 at 09:53
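
One way to run the check suggested in the last comment is to record the length of every chunk the parser hands to characters(). This is a minimal sketch for plain Java; ChunkLengthProbe and chunkLengths are hypothetical names, and the chunk sizes you see depend on the parser implementation, so the 1024 seen on Android is not guaranteed elsewhere.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic handler, for illustration only.
class ChunkLengthProbe extends DefaultHandler {

    final List<Integer> lengths = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Record how much text this single callback delivered.
        lengths.add(length);
    }

    // Parses xml and returns the length of each characters() chunk, in order.
    static List<Integer> chunkLengths(String xml) {
        try {
            ChunkLengthProbe probe = new ChunkLengthProbe();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), probe);
            return probe.lengths;
        } catch (Exception e) {
            // Sketch-level error handling: surface any parser problem unchecked.
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

If the list holds more than one entry for a single long element, the text was split by the parser. The chunks always add up to the full text, so a shorter stored string means the handler kept only one of them.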