
Here is the data:

batiment:Kube D
etage:4ème
description:some_description

I want to read this data through an InputStreamReader:

SharedByteArrayInputStream sbais = (SharedByteArrayInputStream) content;
Reader reader = new InputStreamReader(sbais, Charset.forName("UTF8"));
int size = sbais.available();
char[] theChars = new char[size];
int data = reader.read();
int i = 0;
while (data != -1) {
    theChars[i] = (char) data;
    i++;
    data = reader.read();  
}
String parse = new String(theChars);
String[] parties = parse.split("Content-Transfer-Encoding: quoted-printable");
String partie = (parties[1]).trim();
parties = partie.split("\\R");
String ret = "";
for(String ligne : parties) {
   if (ligne == null || ligne.trim().equals(""))
        break;
   ret = ret.concat(ligne).concat(System.lineSeparator());
}
return ret;

At runtime, the value 4ème comes out as 4=E8me.

What is wrong?

Edit:

Here are the headers of the content:

--_008_DB6P190MB0166B6F4DE5E31397B4A7B558C3C9DB6P190MB0166EURP_
Content-Type: multipart/alternative;
    boundary="_000_DB6P190MB0166B6F4DE5E31397B4A7B558C3C9DB6P190MB0166EURP_"

--_000_DB6P190MB0166B6F4DE5E31397B4A7B558C3C9DB6P190MB0166EURP_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

batiment:KUBE D
etage:4=E8me
description:andrana

Cordialement,

...
pheromix

2 Answers


Your code discards everything in the content that comes before the string Content-Transfer-Encoding: quoted-printable.

That means the content you are working with really is 4=E8me: an ISO-8859-1 string encoded as quoted-printable. In ISO-8859-1, è is the single byte 0xE8, which quoted-printable writes as =E8.
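
To make the relationship concrete, here is a minimal sketch of the encoding direction using only the JDK. The class and method names are hypothetical, and the encoder is deliberately simplified: it ignores the line-length limit and the trailing-whitespace rules of real quoted-printable.

```java
import java.nio.charset.StandardCharsets;

public class QuotedPrintableDemo {

    // Simplified illustration: every byte that is '=' or outside
    // printable ASCII (32..126) is written as =XX in upper-case hex.
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.ISO_8859_1)) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v == '=' || v < 32 || v > 126) {
                out.append(String.format("=%02X", v));
            } else {
                out.append((char) v);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e8 is è; in ISO-8859-1 it is the byte 0xE8
        System.out.println(encode("4\u00e8me")); // prints 4=E8me
    }
}
```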

If you want to transform it to 4ème, you have to decode it.

The JDK has nothing out of the box for this, but this answer will give you some ideas of libraries you can use.

For example, using Apache Commons Codec, it would be something like:

    partie = new QuotedPrintableCodec(StandardCharsets.ISO_8859_1).decode(partie);
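
If adding a dependency is not an option, the decoding step can also be sketched with the JDK alone. This is an illustrative decoder, not the Commons Codec implementation; the class and method names are hypothetical. It assumes the input contains only ASCII characters (as quoted-printable text should), drops soft line breaks ("=" at the end of a line), and keeps ordinary line breaks.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QpDecoder {

    static String decode(String encoded, Charset charset) {
        byte[] in = encoded.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            byte b = in[i];
            if (b != '=') {          // literal byte, copy as-is
                out.write(b);
                i++;
                continue;
            }
            // Soft line break: "=" directly followed by LF or CRLF
            if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 2;
                continue;
            }
            if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 3;
                continue;
            }
            if (i + 2 >= in.length) {
                throw new IllegalArgumentException("Truncated escape at index " + i);
            }
            int hi = Character.digit(in[i + 1], 16);
            int lo = Character.digit(in[i + 2], 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid escape at index " + i);
            }
            out.write((hi << 4) | lo); // rebuild the original byte
            i += 3;
        }
        // Interpret the raw bytes with the charset named in the headers
        return new String(out.toByteArray(), charset);
    }
}
```

Note that it must be given only the body of the part: a header such as charset="iso-8859-1" contains "=" followed by a non-hex character and would be rejected.
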
obourgain
  • I used Apache Commons Codec but I got the exception `Invalid URL encoding: not a valid digit (radix 16): 34` – pheromix Jun 03 '21 at 09:01
  • It depends on the content of your input String. In the content, you should find headers which describe the encoding. Something like `Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: quoted-printable` – obourgain Jun 03 '21 at 11:14
  • And what should I do when I find the headers? – pheromix Jun 03 '21 at 11:30
  • Add them to your question. It is difficult to help you because we don't know how your String is encoded, or even if it is correctly encoded. To get a good answer, we need to have the content of the variable `parse` from your sample. – obourgain Jun 03 '21 at 11:34
  • OK, I edited the post to include the headers before the data I want to retrieve. – pheromix Jun 03 '21 at 11:57
  • Ok, I found the right place to put the decoding :) , tyvm – pheromix Jun 03 '21 at 13:26
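
As the comments point out, the headers of the part say which charset the body uses. A possible sketch of reading that charset from a Content-Type header line instead of hard-coding it is shown here. The class name, method name and regular expression are assumptions for illustration; the fallback to US-ASCII reflects the MIME default for text parts.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeaderCharset {

    // Matches charset=xxx or charset="xxx", case-insensitively
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([^\";\\s]+)\"?", Pattern.CASE_INSENSITIVE);

    static Charset charsetOf(String contentTypeHeader) {
        Matcher m = CHARSET.matcher(contentTypeHeader);
        if (m.find()) {
            // Charset.forName throws if the name is unknown or malformed
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.US_ASCII; // no charset parameter present
    }
}
```

For the header in the question, `charsetOf("Content-Type: text/plain; charset=\"iso-8859-1\"")` yields ISO-8859-1, which can then be passed to the reader and to the decoder.
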

After following obourgain's answer, here is the working code:

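SharedByteArrayInputStream sbais = (SharedByteArrayInputStream) content;
// The part is declared as charset="iso-8859-1", so read it with that charset.
Reader reader = new InputStreamReader(sbais, StandardCharsets.ISO_8859_1);
// ISO-8859-1 maps one byte to one char, so available() is a large enough buffer size here.
int size = sbais.available();
char[] theChars = new char[size];
int data = reader.read();
int i = 0;
while (data != -1) {
    theChars[i] = (char) data;
    i++;
    data = reader.read();
}
// Only use the chars that were actually read.
String parse = new String(theChars, 0, i);
// Keep only what follows the quoted-printable header.
String[] parties = parse.split("Content-Transfer-Encoding: quoted-printable");
String partie = parties[1].trim();
String[] lignes = partie.split("\\R");
// QuotedPrintableCodec comes from Apache Commons Codec (org.apache.commons.codec.net).
QuotedPrintableCodec codec = new QuotedPrintableCodec(StandardCharsets.ISO_8859_1);
StringBuilder ret = new StringBuilder();
for (String ligne : lignes) {
    // The wanted block ends at the first blank line.
    if (ligne.trim().isEmpty())
        break;
    // Decode line by line; decode() throws DecoderException on malformed input.
    ret.append(codec.decode(ligne, StandardCharsets.ISO_8859_1)).append(System.lineSeparator());
}
return ret.toString();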
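
One caveat on the reading part: available() is documented as an estimate of the bytes readable without blocking, so sizing a buffer from it is fine for an in-memory stream but fragile in general. A sketch of the extraction step that avoids it is shown here, assuming Java 9+ for readAllBytes(); the class and method names are hypothetical. It returns the still-encoded lines, which can then be decoded one by one as above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BodyExtractor {

    static final String MARKER = "Content-Transfer-Encoding: quoted-printable";

    // Returns the lines after the marker up to the first blank line,
    // each terminated by '\n', without decoding them.
    static String extractEncodedBody(InputStream content) {
        String text;
        try {
            text = new String(content.readAllBytes(), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int start = text.indexOf(MARKER);
        if (start < 0) {
            throw new IllegalArgumentException("Marker header not found");
        }
        String afterHeaders = text.substring(start + MARKER.length()).trim();
        StringBuilder body = new StringBuilder();
        for (String line : afterHeaders.split("\\R")) {
            if (line.trim().isEmpty()) {
                break; // the wanted block ends at the first blank line
            }
            body.append(line).append('\n');
        }
        return body.toString();
    }
}
```
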
pheromix