I get a string stream from an HTTP request. The stream looks like:

<?xml version="1.0" encoding="utf-8"?>

The first three characters indicate that the string is encoded as UTF-8.

I'm creating files from that string. When I read them back, I get an error (the stack trace is further down).

This is the method I use to write the files:

private void writeToFile(String data, String fileName) {
    try {
        String UTF8 = "UTF-8";
        int BUFFER_SIZE = 8192;

        String xmlCut = data.substring(3);

        File sdCard = Environment.getExternalStorageDirectory();
        File dir = new File (sdCard.getAbsolutePath()+"/example/Test");
        dir.mkdirs();
        File file = new File(dir,fileName);

        FileOutputStream f = new FileOutputStream(file);
        FileOutputStream fileOutputStream = openFileOutput(fileName, Context.MODE_PRIVATE);
        BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(fileOutputStream,UTF8),BUFFER_SIZE);
        bufferedWriter.write(String.valueOf(data.getBytes("UTF-8")));
        f.write(data.getBytes("UTF-8"));
        f.close();
        bufferedWriter.close();
    } catch (IOException e) {
        Log.e("writeToFile: ", "Datei-Erstellung fehlgeschlagen: " + e.toString());
    }

}

As you can see, I've added the substring call to cut off the first three characters, because they lead to a crash. The problem is that the files are then encoded as ASCII.

The method that reads the files:

 private String readFromFile(String fileName) {
    String ret = "";
    String UTF8 = "UTF-8";
    int BUFFER_SIZE = 8192;

    try {
        InputStream inputStream = openFileInput(fileName);

        if (inputStream != null) {


            BufferedReader bufferedReader1 = new BufferedReader(new InputStreamReader(inputStream,UTF8),BUFFER_SIZE);
            String receiveString = "";
            StringBuilder stringBuilder = new StringBuilder();

            while ((receiveString = bufferedReader1.readLine()) != null) {
                stringBuilder.append(receiveString);
            }

            inputStream.close();
            ret = stringBuilder.toString();
        }
    } catch (FileNotFoundException e) {
        Log.e("readFromFile: ", "Datei nicht gefunden: " + e.toString());
    } catch (IOException e) {
        Log.e("readFromFile: ", "Kann Datei nicht lesen: " + e.toString());
    }
    return ret;
}

If I don't cut off the UTF-8 characters, I get this error in the stack trace:

Caused by: java.lang.NullPointerException: Attempt to invoke interface method 'org.w3c.dom.NodeList org.w3c.dom.Document.getElementsByTagName(java.lang.String)' on a null object reference
        at de.example.app.ListViewActivity.setListProjectData(ListViewActivity.java:226)

It happens here:

public void setListProjectData(String filename) {

    XMLParser parser = new XMLParser();
    String xmlData = readFromFile(filename);
    String xmlCut = xmlData.substring(3);
    Document doc = parser.getDomElement(filename);

    NodeList nodeListProject = doc.getElementsByTagName(KEY_PROJECT);


    for (int i = 0; i < nodeListProject.getLength(); i++) {

        HashMap<String, String> map = new HashMap<String, String>();
        Element e = (Element) nodeListProject.item(i);

        map.put(KEY_UUID, parser.getValue(e, KEY_UUID));
        map.put(KEY_NAME, parser.getValue(e, KEY_NAME));
        map.put(KEY_JOBTITLE, parser.getValue(e, KEY_JOBTITLE));
        map.put(KEY_JOBINFO, parser.getValue(e, KEY_JOBINFO));
        map.put(KEY_PROJECTIMAGE, parser.getValue(e, KEY_PROJECTIMAGE));


        projectItems.add(map);
    }
}

I get the data over HTTP here:

public String getXMLFromUrl(String url) {
    String xml = null;

    if (cd.isConnectingToInternet()) {
        try {
            //defaultHttpClient
            DefaultHttpClient httpClient = new DefaultHttpClient();
            HttpPost httpPost = new HttpPost(url);

            HttpResponse httpResponse = httpClient.execute(httpPost);
            HttpEntity httpEntity = httpResponse.getEntity();
            /*
            final InputStream in = httpEntity.getContent();
            Reader reader = new InputStreamReader(in,"UTF-8");
            InputSource is = new InputSource(reader);
            is.setEncoding("UTF-8");

            */
            xml = EntityUtils.toString(httpEntity);

        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    } else {
        return null;
    }

    return xml;
}

So, how can I encode the files as UTF-8? Am I doing it right?

korunos

1 Answer

Your problem is not in the code you have posted, but in the code that gets the data from the HTTP request.

You are passing String data to the writeToFile method. A String in Java is UTF-16 encoded. If the UTF-8 data was decoded incorrectly when that string was created, no amount of further encoding and decoding is going to fix data that is already broken.
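
To illustrate the point, here is a minimal sketch in plain Java (no Android or HttpClient classes). It assumes the response bytes were decoded as ISO-8859-1, which is only one possible wrong charset; the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class WrongCharsetDemo {
    // Decodes the bytes with the wrong charset (assumed here: ISO-8859-1).
    static String decodeAsLatin1(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Decodes the bytes with the charset they were actually written in.
    static String decodeAsUtf8(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Gr\u00FC\u00DFe"; // "Gruesse" spelled with u-umlaut and sharp s
        byte[] wire = original.getBytes(StandardCharsets.UTF_8); // bytes as a server would send them

        String correct = decodeAsUtf8(wire);
        String broken = decodeAsLatin1(wire);

        System.out.println(correct.equals(original)); // true
        System.out.println(broken.equals(original));  // false

        // Writing the broken string out as UTF-8 and reading it back
        // just reproduces the broken string.
        String roundTrip = new String(broken.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals(original)); // false
    }
}
```

The charset has to be right at the moment the bytes become a String; saving the result as UTF-8 afterwards only preserves whatever the String already contains.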

You should use xml = EntityUtils.toString(httpEntity, HTTP.UTF_8) to decode the data properly. The second argument is the charset applied when the response does not declare one itself.

There is an additional issue if the returned data contains a UTF-8 BOM. The line above will decode the data correctly, but it will leave the superfluous (and wrong) BOM in place.
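
A short plain-Java sketch of what "left in place" means: Java's UTF-8 decoder turns the three BOM bytes (EF BB BF) into the single character U+FEFF at the start of the string. The class and method names below are hypothetical and the XML snippet is a stand-in for the real response.

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {
    // True if the decoded text still starts with U+FEFF, the decoded form of a UTF-8 BOM.
    static boolean startsWithBom(String text) {
        return text.startsWith("\uFEFF");
    }

    public static void main(String[] args) {
        // Encoding U+FEFF as UTF-8 produces the bytes EF BB BF, i.e. a BOM in front of the XML.
        byte[] body = "\uFEFF<a/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(body.length); // 7: three BOM bytes plus four bytes of XML

        String decoded = new String(body, StandardCharsets.UTF_8);
        System.out.println(decoded.length());       // 5: the BOM survives as one extra character
        System.out.println(startsWithBom(decoded)); // true
    }
}
```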

To solve that, either the server has to return the data without a BOM, or the BOM has to be stripped. The following code (or something similar) can be used to do so:

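public static String stripBOM(InputStream stream)
{
    try
    {
        // Read the whole stream into memory first.
        byte[] buffer = new byte[1024];
        ByteArrayOutputStream os = new ByteArrayOutputStream(1024);
        int bytesRead;
        while ((bytesRead = stream.read(buffer)) != -1)
        {
            os.write(buffer, 0, bytesRead);
        }
        os.close();
        String text = os.toString("UTF-8");
        // A UTF-8 BOM (EF BB BF) decodes to the single character U+FEFF.
        // Drop it only if it is actually there, so data without a BOM is not truncated.
        if (text.startsWith("\uFEFF"))
        {
            text = text.substring(1);
        }
        return text;
    }
    catch (IOException e)
    {
        return "";
    }
}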

So xml = EntityUtils.toString(httpEntity, HTTP.UTF_8) can be replaced with:

 InputStream is = httpEntity.getContent();
 xml = stripBOM(is);
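
One more thing worth checking in writeToFile from the question: String.valueOf(data.getBytes("UTF-8")) produces the array's identity string (something like "[B@1a2b3c"), not the text, so the file written through the BufferedWriter will not contain the XML. A hedged sketch of a UTF-8 round trip in plain Java follows; it uses a temporary file instead of Android's openFileOutput, and the class and method names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8FileRoundTrip {
    // Writes the string itself; the writer does the UTF-8 encoding.
    static void writeUtf8(File file, String data) throws IOException {
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(data);
        }
    }

    // Reads the whole file back as UTF-8, keeping line breaks intact.
    static String readUtf8(File file) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            char[] buffer = new char[8192];
            int charsRead;
            while ((charsRead = reader.read(buffer)) != -1) {
                text.append(buffer, 0, charsRead);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("utf8-demo", ".xml");
        file.deleteOnExit();
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<name>Gr\u00FC\u00DFe</name>";
        writeUtf8(file, xml);
        System.out.println(readUtf8(file).equals(xml)); // true
    }
}
```

On Android the FileOutputStream and FileInputStream would come from openFileOutput and openFileInput, as in the question; the encoding part stays the same.
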
Dalija Prasnikar