I'm developing an Android application that scans Data Matrix codes using Google's ML Kit, but I'm unable to parse codes whose data is encoded in ISO-8859-1 and contains non-ASCII characters.

Here's an example: DataMatrix failing with ML Kit

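val options = BarcodeScannerOptions.Builder()
    .setBarcodeFormats(Barcode.FORMAT_DATA_MATRIX)
    .build()
val scanner = BarcodeScanning.getClient(options)
scanner.process(image).addOnSuccessListener { barcodes ->
    // Nothing to log if no barcode was detected.
    val barcode = barcodes.firstOrNull() ?: return@addOnSuccessListener
    // Decoded string as reported by ML Kit.
    Log.i(TAG, barcode.rawValue ?: "null")
    // Raw payload bytes, decoded explicitly as ISO-8859-1.
    Log.i(TAG, barcode.rawBytes?.let { String(it, StandardCharsets.ISO_8859_1) } ?: "null")
}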

Both log statements output a string containing the literal text "Unknown encoding".

I'm using the latest version of MLKit's barcode-scanning library:

implementation 'com.google.mlkit:barcode-scanning:16.1.1'

ZXing does the job, but it has much more difficulty recognizing real-life, not-so-perfect scans.

Any idea, any hint?

  • Maybe it's being processed in text mode? In text mode, it can only process ASCII characters. [Data Matrix - Wikipedia](https://en.wikipedia.org/wiki/Data_Matrix) Is there an option in either library to explicitly specify Base256 mode, both when encoding and when decoding? If you're dealing with ISO-8859-1 encoded data, you'll probably need to set those modes and options to convert the string to a byte array. – kunif Feb 02 '21 at 02:41
  • Could you share your image of the barcode of DataMatrix with ISO-8859-1 encoding? We will forward the case to our research team for further investigation. Thanks! -- From ML Kit Team – Julie Zhou Feb 19 '21 at 21:28
  • @JulieZhou Just added an example of an official German Medication Plan DataMatrix with ISO-8859-1 encoding. Thank you in advance for your help! – Dirk Spöri Feb 22 '21 at 19:00
  • Thanks Dirk, will take a further look with the example barcode. – Julie Zhou Feb 23 '21 at 14:46
  • Thank you, Julie! Hope there's a solution - it's a rather common usecase. Ugly that we still have to deal with non-unicode encodings, but there's no choice. – Dirk Spöri Feb 23 '21 at 20:46
  • Hi Dirk, please refer to Chenxi Song's answer. Our internal testing showed that this was not an encoding issue, but a padding problem. We are considering documenting the padding requirement on our page. Please let us know if adding the padding works for you. Thanks! – Julie Zhou Feb 27 '21 at 18:43
  • Hi Julie, sorry about the padding, but it doesn't solve the problem. `barcode.rawValue` does contain the string `Unknown encoding`, see my comment on Chenxi Song's answer. – Dirk Spöri Feb 28 '21 at 21:15
  • Hi Julie, if you compare the results of MLKit and ZXing, the bug in MLKit seems obvious to me. MLKit: `

    – Dirk Spöri Mar 03 '21 at 19:27
  • Hi Dirk Spöri, as Chenxi mentioned, we've forwarded this problem to our research team for further investigation. Thanks for reporting the problem and helping to provide the details. We will update you when we hear back from the research team. – Julie Zhou Mar 20 '21 at 19:05

2 Answers

The problem with the barcode picture is that there is no padding (quiet zone) around the barcode, which causes the scanner to return an empty result.

After adding some padding around the barcode content, the barcode is detected. Detected result
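As a rough illustration of what "adding padding" can mean in code, here is a minimal sketch that surrounds a row-major ARGB pixel array with a solid white margin. The helper name `padPixels` and its parameters are hypothetical and not part of ML Kit. Wiring it to an Android `Bitmap` (for example via `Bitmap.getPixels` and `Bitmap.createBitmap`) is an assumption left to the reader.

```kotlin
/**
 * Returns a copy of a row-major ARGB pixel array with a solid margin on all sides.
 * Hypothetical helper for illustration; not part of ML Kit.
 */
fun padPixels(
    pixels: IntArray,
    width: Int,
    height: Int,
    margin: Int,
    fill: Int = 0xFFFFFFFF.toInt() // opaque white
): IntArray {
    require(pixels.size == width * height) { "pixel count must equal width * height" }
    require(margin >= 0) { "margin must not be negative" }

    val newWidth = width + 2 * margin
    val newHeight = height + 2 * margin
    // Start with an image that is entirely the fill color.
    val out = IntArray(newWidth * newHeight) { fill }

    // Copy each source row into the padded image, shifted by the margin.
    for (y in 0 until height) {
        pixels.copyInto(
            destination = out,
            destinationOffset = (y + margin) * newWidth + margin,
            startIndex = y * width,
            endIndex = (y + 1) * width
        )
    }
    return out
}
```

The padded image is `width + 2 * margin` by `height + 2 * margin` pixels. Note that, according to the asker's comments, padding made detection work but did not by itself remove the "Unknown encoding" text from the result.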

Chenxi Song
  • Sorry about the padding, but it still fails. In my app, the output of `barcode?.rawValue` contains random "Unknown encoding" strings: `

    – Dirk Spöri Feb 28 '21 at 20:30
  • Hi, using the ML Kit vision quickstart app with your BarcodeScanner settings, I can get a result both from a live preview when scanning this barcode and from this image downloaded to my device and converted to a bitmap. Could you check the ML Kit quickstart app code here: https://github.com/googlesamples/mlkit/blob/369dd896f4b7fadeedd3e0860f2e1695db0b4d9b/android/vision-quickstart/app/src/main/java/com/google/mlkit/vision/demo/kotlin/StillImageActivity.kt#L303 for reading a bitmap from a URL and feeding it to the barcode scanner API? There might be an issue with the InputImage. – Chenxi Song Mar 02 '21 at 21:38
  • Hi Chenxi, right now, I retried with Vision QuickStart, again with the same result, again broken. See the "Unknown Encoding" string breaking the XML due to the Non-ASCII characters: `2021-03-03 20:22:21.496 18480-18480/com.google.mlkit.vision.demo I/BarcodeProcessor:

    ...`
    – Dirk Spöri Mar 03 '21 at 19:23
  • Hey sorry for misreading your previous reply. I will check with internal teams to see if they can improve this result. "barcode raw value:

    – Chenxi Song Mar 19 '21 at 20:43
  • Hi Dirk Spöri, would you mind if we used this [image](https://i.stack.imgur.com/F9JEj.png) in our internal unit tests? – Julie Zhou Apr 07 '21 at 18:31
  • Sorry that I missed answering this comment. But you fixed it, thank you :) – Dirk Spöri May 25 '21 at 22:11
  • Hi Julie Zhou and Chenxi Song, thank you very much for taking care of this issue and fixing it with the latest library version. This fix will be very, very helpful as the ML Kit image recognition for blurry / printed codes is great. – Dirk Spöri May 25 '21 at 22:18

With the latest version of the ML Kit Barcode Scanning library, 16.1.2, rawBytes now returns the expected content of the barcode, including for non-Unicode encodings.

implementation 'com.google.mlkit:barcode-scanning:16.1.2'
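
Once rawBytes carries the real payload, the bytes still have to be decoded with the charset the barcode was written in. The following is a minimal sketch of that step in plain Kotlin. `decodeBarcodeBytes` is a hypothetical helper, not an ML Kit function, and the sample bytes are made up for illustration.

```kotlin
import java.nio.charset.Charset
import java.nio.charset.StandardCharsets

/**
 * Decodes raw barcode bytes with an explicit charset (ISO-8859-1 by default).
 * Hypothetical helper for illustration; not part of ML Kit.
 */
fun decodeBarcodeBytes(
    rawBytes: ByteArray?,
    charset: Charset = StandardCharsets.ISO_8859_1
): String? = rawBytes?.let { String(it, charset) }

fun main() {
    // "Spöri" encoded as ISO-8859-1: 'ö' is the single byte 0xF6.
    val latin1Bytes = byteArrayOf(0x53, 0x70, 0xF6.toByte(), 0x72, 0x69)

    // Decoding with the matching charset restores the original text.
    println(decodeBarcodeBytes(latin1Bytes))

    // Decoding the same bytes as UTF-8 does not: 0xF6 is not valid UTF-8 here.
    println(decodeBarcodeBytes(latin1Bytes, StandardCharsets.UTF_8))
}
```

Inside the success listener this would presumably be called as `decodeBarcodeBytes(barcode.rawBytes)`. The charset is passed explicitly because the scanner cannot be assumed to know which encoding the producer of the code used.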