I'm trying to build an Android application that uses the zbar library to scan codes. I generated QR codes with UTF-8 encoding and am using this Android app to scan them. The text I'm encoding is "L'étoile". I tried the default zbar test program and noticed that it doesn't display accented characters correctly, so I slightly modified its code, as shown below, to debug it and understand why.

byte[] bytes = sym.getDataBytes();
String latin1Result = new String(bytes, "ISO8859-1");
String utf8Result = new String(bytes, "UTF-8");
Log.d("CUSTOM_DEBUG_TAG", "result " + sym.getData() + ", string " + sym.getData().toString() + ". latin1 result " + latin1Result + ". utf8 result " + utf8Result);

From the log I get:

CUSTOM_DEBUG_TAG(11987): result L'テゥtoile, string L'テゥtoile. latin1 result L'ï¾ï½©toile. utf8 result L'テゥtoile

I'm a bit lost when it comes to character sets and encodings, so please bear with me. From the log above, can I conclude that the zbar library is actually returning a UTF-8 encoded string "L'étoile"? If so, shouldn't it display correctly in the log?
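One way to check this without guessing at encodings is to log the raw bytes as hex and compare them with what a UTF-8 "é" should look like (C3 A9). The sketch below is only an illustration: `ByteDump` and `toHex` are hypothetical names, not part of zbar. The "latin1 result" in the log above appears to correspond to the bytes EF BE 83 EF BD A9, which is the UTF-8 encoding of two half-width katakana characters (U+FF83, U+FF69) rather than of é.

```java
import java.nio.charset.StandardCharsets;

public class ByteDump {
    /** Formats bytes as space-separated uppercase hex, e.g. "C3 A9". */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // What a UTF-8 encoded "é" looks like as raw bytes.
        System.out.println(toHex("\u00e9".getBytes(StandardCharsets.UTF_8)));
        // UTF-8 encoding of the half-width katakana U+FF83 U+FF69.
        System.out.println(toHex("\uFF83\uFF69".getBytes(StandardCharsets.UTF_8)));
    }
}
```

In the app, the same helper could be applied to the array from the question, for example `Log.d("CUSTOM_DEBUG_TAG", toHex(sym.getDataBytes()))`, to see exactly which bytes the library hands back.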

I believe zbar uses iconv and defaults to ISO-8859-1, so I also generated a QR code with ISO-8859-1 encoded text. When I read that QR code with the Android application, the log showed this:

CUSTOM_DEBUG_TAG(11987): result L'騁oile, string L'騁oile. latin1 result L'é¨oile. utf8 result L'騁oile

As you can see, I'm unable to retrieve the accented string "L'étoile". Obviously there are concepts I haven't grasped, and I'm hoping for some help.

By the way, if I scan the same QR code with applications such as QR Droid or Zxing, the string is displayed correctly as "L'étoile", so I'm ruling out a problem with the QR code itself.

Thanks

user1936810
  • What do you get when you create a new String from the bytes without specifying an encoding? – Moh Sakkijha Dec 29 '12 at 19:30
  • The code would be `byte[] bytes = sym.getDataBytes(); String Result = new String(bytes); Log.d("CUSTOM_DEBUG_TAG", "Result " + Result); ` and if I use a utf-8 encoded QRcode I get: `Result L'テゥtoile` – user1936810 Dec 29 '12 at 19:40
  • Try not specifying any encoding, e.g. `String data = new String(bytes);`. What does that give you? – Moh Sakkijha Dec 29 '12 at 19:42
  • and if I scan an ISO-8859-1 encoded QRcode then the result is `Result L'騁oile` – user1936810 Dec 29 '12 at 19:46

2 Answers

After some trial and error, it seems that zbar doesn't use ISO-8859-1 but Shift_JIS when it finds special characters. Here is what works for me:

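// sym.getData() appears to have been decoded as Shift_JIS; re-encode it to recover the original bytes
byte[] b = sym.getData().getBytes("Shift_JIS");
// then decode those bytes with the charset the QR code was actually written in
String value = new String(b, "UTF-8");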
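To see why this workaround can recover the text, here is a minimal, self-contained sketch that simulates the suspected behaviour outside of zbar. It assumes the library decodes the QR payload as Shift_JIS, which is this answer's guess rather than something confirmed from the zbar sources. `ShiftJisRoundTrip`, `misdecodeAsShiftJis` and `repair` are hypothetical names, and the sketch also assumes the runtime provides a Shift_JIS charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ShiftJisRoundTrip {
    // Assumption: Shift_JIS is available on this runtime; otherwise
    // Charset.forName throws UnsupportedCharsetException.
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    /** Simulates the suspected bug: bytes written in `actual` are decoded as Shift_JIS. */
    static String misdecodeAsShiftJis(String text, Charset actual) {
        return new String(text.getBytes(actual), SHIFT_JIS);
    }

    /** Undoes it: re-encode as Shift_JIS to get the raw bytes back, then decode as `actual`. */
    static String repair(String garbled, Charset actual) {
        return new String(garbled.getBytes(SHIFT_JIS), actual);
    }

    public static void main(String[] args) {
        String garbled = misdecodeAsShiftJis("L'\u00e9toile", StandardCharsets.UTF_8);
        System.out.println(garbled);                                 // mojibake with two katakana
        System.out.println(repair(garbled, StandardCharsets.UTF_8)); // original text again
    }
}
```

For the ISO-8859-1 QR code from the question, the same idea would apply with `StandardCharsets.ISO_8859_1` as the second argument. Note that the round trip is only lossless when every byte sequence in the payload happens to be valid Shift_JIS; any sequence that is not gets replaced during decoding and cannot be recovered this way.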
jdtremblay

The charset name is missing a dash (-):

String latin1Result = new String(bytes, "ISO-8859-1");

Simply using new String(bytes) falls back to the platform's default encoding, which makes the application less portable.

If that does not help, try a string literal to see whether the problem lies in the output itself.

String result = "\u00e9toile";
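As a small illustration of the portability point, the sketch below decodes the same two bytes once with the platform default and once with an explicit charset. `ExplicitCharset` and `decodeUtf8` are hypothetical names used only for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharset {
    /** Decodes bytes as UTF-8 regardless of the platform default. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] eAcute = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 bytes of "é"
        System.out.println("default charset: " + Charset.defaultCharset());
        // Result depends on the default charset printed above.
        System.out.println(new String(eAcute));
        // Result is the same on every platform.
        System.out.println(decodeUtf8(eAcute));
    }
}
```

Using the `StandardCharsets` constants also avoids misspelled charset names, since there is no string to get wrong.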
Joop Eggen
  • if my code is: `byte[] bytes = sym.getDataBytes(); String Result = new String(bytes); String latin1Result = new String(bytes, "ISO-8859-1"); String utf8Result = new String(bytes, "UTF-8"); String literalResult = "\u00e9toile"; Log.d("CUSTOM_DEBUG_TAG", "result " + sym.getData() + ", string " + sym.getData().toString() + ". latin1 result " + latin1Result + ". utf8 result " + utf8Result + ". Result " + Result + ". literalResult " + literalResult);` – user1936810 Dec 30 '12 at 09:58
  • then the log output is: `CUSTOM_DEBUG_TAG(13422): result L'テゥtoile, string L'テゥtoile. latin1 result L'ï¾ï½©toile. utf8 result L'テゥtoile. Result L'テゥtoile. literalResult étoile`. So what does this suggest? That the zbar library is giving me a corrupt array? – user1936810 Dec 30 '12 at 09:59
  • Yes, give the zbar forums/FAQ/HowTos a try. I found the [same problem](http://sourceforge.net/projects/zbar/forums/forum/1072195/topic/5068736). Sorry I could not help. – Joop Eggen Dec 30 '12 at 18:51
  • The zbar forum post you refer to actually hints that an iOS user has successfully displayed the string via encoding:NSUTF8StringEncoding. However, it doesn't seem to be the case here. Anyway, I think zbar is the culprit because the same encoding issues were found with zbar command-line tools. So thanks anyway, even though my problem is unsolved. At least I know now that it seems to be an issue with the external library. – user1936810 Dec 31 '12 at 18:58
  • Good luck with your further search. – Joop Eggen Dec 31 '12 at 20:50