
I have some binary data that I want to encode in a QR code and then be able to decode, all of that in bash. After a search, it looks like I should use qrencode for encoding and zbarimg for decoding. After a bit of troubleshooting, I still cannot decode what I encoded.

Any idea why? Currently the closest I am to a solution is:

$ dd if=/dev/urandom bs=10 count=1 status=none > data.bin
$ xxd data.bin
00000000: b255 f625 1cf7 a051 3d07                 .U.%...Q=.
$ cat data.bin | qrencode -l H -8 -o data.png
$ zbarimg --raw --quiet data.png | xxd
00000000: c2b2 55c3 b625 1cc3 b7c2 a051 3d07 0a    ..U..%.....Q=..

It looks like I am not very far, but something is still off.

Edit 1: a possible fix is to use base64 wrapping, as explained in the answer by @leagris .

Edit 2: using base64 encoding doubles the size of the message. The reason why I use binary in the first place is to be size-efficient so I would like to avoid that. De-accepting the answer by @leagris as I would like to have it 'full binary', sorry.

Edit 3: as of 2020-03-03 it looks like this is a well-known issue of zbarimg and that a pull request to fix this is on its way:

https://github.com/mchehab/zbar/pull/64

Edit 4: if you know of another command-line tool on Linux that is able to decode QR codes with binary content, please feel free to let me know.

Zorglub29

2 Answers


My pull request has been applied. ZBar version 0.23.1 and newer will be able to decode binary QR codes:

zbarimg --raw --oneshot -Sbinary qr.png
zbarcam --raw --oneshot -Sbinary

QR codes have several encoding modes. The simplest, most commonly used and widely supported is the alphanumeric encoding which is suitable for simple text. The byte encoding allows storing arbitrary 8 bit data in the QR code. The ECI mode is like 8 bit mode but with additional metadata that tells the decoder which character set to use in order to decode the binary data back to text. Here's a list of known ECI values and the character encodings they represent. For example, when a decoder encounters an ECI 26 mode QR code it knows to decode the binary data as UTF-8.

The qrencode tool is doing its job correctly: it is creating a byte mode QR code with the data you gave it as its contents. The problem is most decoders were explicitly designed to handle textual data first and foremost. The retrieval of the raw binary data is a detail at best.

Current versions of the zbar library will treat byte mode QR codes as if they were unknown ECI mode QR codes. If a character set isn't specified, it will attempt to guess the encoding and convert the data to it. This will most likely mangle the binary data. As you noted, I brought this up in issue #55 and after some time managed to submit a pull request to improve this. Should it be merged, the library will have a binary decoder option that instructs decoders to return the raw binary data without converting it. Another source of data mangling is the tendency of the command line tools to append line feeds to the output. I submitted a pull request to allow users to prevent this and it has already been merged.
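The mangling described above can be reproduced without any QR tooling. The sketch below is an illustration, not zbar's actual code: it assumes the decoder guessed ISO-8859-1 for the sample bytes in the question and converted them to UTF-8, which matches the `xxd` output shown there (minus the trailing line feed). `iconv` and `od` stand in for the library's internal conversion, and the variable names are made up for this example.

```shell
# The 10 sample bytes from the question (b2 55 f6 25 1c f7 a0 51 3d 07),
# written as octal escapes so that any POSIX printf can emit them via %b.
sample='\0262\0125\0366\0045\0034\0367\0240\0121\0075\0007'

# Assumption: the decoder treats the payload as ISO-8859-1 text and
# re-encodes it as UTF-8. Every byte >= 0x80 becomes a two-byte sequence.
mangled_hex=$(printf '%b' "$sample" \
  | iconv -f ISO-8859-1 -t UTF-8 \
  | od -An -tx1 | tr -d ' \n')
echo "mangled:  $mangled_hex"

# Under the same assumption, the conversion can be undone after decoding.
restored_hex=$(printf '%b' "$sample" \
  | iconv -f ISO-8859-1 -t UTF-8 \
  | iconv -f UTF-8 -t ISO-8859-1 \
  | od -An -tx1 | tr -d ' \n')
echo "restored: $restored_hex"
```

Reversing the conversion like this is only a workaround for payloads where the guess happened to be ISO-8859-1. A different guess is not guaranteed to be reversible, which is why a raw binary option in the decoder is the more robust fix.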

The zxing-cpp library will also try to guess the encoding of binary data in QR codes. The comments suggest that the QR code specification requires that decoders pick an encoding without specifying a default or allowing them to return the raw binary data. In order to make that possible, the binary data is copied to a byte array which can be accessed through the DecoderResult. When I have some free time, I intend to write zximg and zxcam tools with binary decoding support for this library.

It's always possible to encode binary data as base 64 and encode the result as an alphanumeric QR code. However, base 64 encoding will increase the size of the data and the alphanumeric mode doesn't allow use of the QR code's maximum capacity. In a comment, you mentioned what you intend to use binary QR codes for:

I want to have a package to effectively dump some gpg stuff in a format that makes recovery easy.

That is the exact use case I'm attempting to enable with my pull request: an easier-to-restore paperkey. 4096 bit RSA secret keys can be directly QR encoded in 8 bit mode but not in alphanumeric mode as base 64-encoded data.
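The capacity cost of base 64 can be estimated with a little shell arithmetic. This is a back-of-the-envelope sketch under stated assumptions: the base 64 text is stored in byte mode (as with `qrencode -8` elsewhere in this thread), and 2953 bytes is taken as the byte-mode capacity of the largest QR code (version 40, lowest error correction level). The variable names are hypothetical.

```shell
# Assumed byte-mode capacity of a version 40 QR code at error correction level L.
qr_byte_capacity=2953

# Base 64 turns every 3 input bytes into 4 output characters (with padding).
raw_len=3000
b64_len=$(( (raw_len + 2) / 3 * 4 ))

# Cross-check the formula against the real base64 tool (line wraps removed).
measured_len=$(( $(head -c "$raw_len" /dev/zero | base64 | tr -d '\n' | wc -c) ))

# Largest raw payload that still fits once it has been base 64-encoded.
max_raw_via_b64=$(( qr_byte_capacity / 4 * 3 ))

echo "raw bytes: $raw_len, base 64 chars: $b64_len (measured: $measured_len)"
echo "max raw payload: $qr_byte_capacity direct vs $max_raw_via_b64 via base 64"
```

The overhead is about one third, so a payload that only just fits as raw bytes will not fit once it is base 64-encoded.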

Community
Matheus Moreira
  • Yes, agree 100%, I also want a 'better paperkey' :) – Zorglub29 Mar 04 '20 at 06:31
  • Btw @MatheusMoreira I am working on my pure bash little 'qr code paperkey'. Hope to have some code that looks like something within a few days, I will let you know then. – Zorglub29 Mar 04 '20 at 13:00
  • @Zorglub29 If my patch is merged, a key restoration script will probably be as simple as `zbarcam --raw --oneshot -Sbinary | paperkey --pubring public.gpg | gpg --import`. I was planning to add these exact instructions to [the Arch Wiki's paperkey article](https://wiki.archlinux.org/index.php/Paperkey) after the feature is available in the repositories. – Matheus Moreira Mar 04 '20 at 14:42
  • Yes, I agree that for this specific case creating just one QR code will be enough, and this will be an extremely welcome addition. But when thinking about it a few days ago, I came to the conclusion that a 'true' QR code dump bash package would be welcome, so that one can also dump somewhat larger things: the full content of my password manager with `pass`, a full message that I may want to send by post, or something like that :) – Zorglub29 Mar 04 '20 at 15:09
  • @Zorglub29 You could split binary data into multiple parts and encode each part as a separate QR code. If you decode them in the correct order, the result should be the original file. You could theoretically QR encode files of any size this way. Several bar code formats support structured append mode which helps the decoder figure out the correct order of each bar code in the sequence. I've never actually seen anyone use this feature though. There's code in the decoders to handle structured append but I don't know how robust it is. I've never tested it either. – Matheus Moreira Mar 04 '20 at 17:39
  • Yes, this is exactly what I am working on :) I will put my scripts for that in their own repo and try to set up a simple package. I will let you know when it starts to take shape :) – Zorglub29 Mar 04 '20 at 17:42
  • @Zorglub29 By the way, the `--oneshot` feature is useful even if you're trying to read multiple bar codes in a sequence. When you run `zbarcam` and place a QR code in front of the camera, there's a chance it will decode that QR code multiple times. This will result in the data being duplicated in the output, corrupting the file you're trying to restore. One shot mode forces `zbarcam` to terminate after reading exactly one bar code, allowing you to program in a time out before your script tries to read the next one. This gives you time to prepare the next bar code. – Matheus Moreira Mar 04 '20 at 17:45
  • Thanks. I was thinking about having a (very simple) binary metadata field to take care of all of that :) . – Zorglub29 Mar 04 '20 at 17:53
  • I have started working on the 'bash package' to help with dumping into a series of QR codes here: https://github.com/jerabaul29/qrdump . It is still very primitive / ugly / dysfunctional, but I hope it will be better within a couple of weeks. I will let you know. – Zorglub29 Mar 05 '20 at 16:17
  • I think that qrdump starts to look like something reasonable. Feel free to comment on API etc, now that I have a working example the plan is to 1) stabilize API 2) refactor the inner code. – Zorglub29 Mar 10 '20 at 09:27
  • Very interesting - I did write a modern version of a tool I've developed for internal use that uses encrypted QR codes for secrets backup: https://github.com/yawn/offkey - I think it might make a lot of sense to at least support the option of using binary encodings directly. This will reduce recovery options but enable larger secrets. Thanks for the PR! – yawn Jul 29 '22 at 08:53
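
The splitting approach discussed in the comments above can be sketched as follows. This is a minimal illustration, not the qrdump implementation: the 2000-byte chunk size is an arbitrary value assumed to stay under the byte-mode capacity of a large QR code, the file names are hypothetical, and the restore side is simulated with `cat` instead of scanning each image. The `qrencode` flags are the ones already used in this thread, and the loop skips encoding if the tool is not installed.

```shell
# Work in a scratch directory with a random 5000-byte stand-in for the secret.
workdir=$(mktemp -d)
head -c 5000 /dev/urandom > "$workdir/secret.bin"

# Split into fixed-size chunks named chunk.aa, chunk.ab, ... (sorted by suffix).
chunk_size=2000   # assumption: small enough for one byte-mode QR code
split -b "$chunk_size" "$workdir/secret.bin" "$workdir/chunk."

# Encode each chunk as its own QR code when qrencode is available.
chunk_count=0
for chunk in "$workdir"/chunk.??; do
  chunk_count=$(( chunk_count + 1 ))
  if command -v qrencode >/dev/null 2>&1; then
    qrencode -8 -l L -r "$chunk" -o "$chunk.png"
  fi
done

# Restoring means concatenating the decoded chunks in the same order.
# With real scans, each chunk would come from a binary-capable decoder.
cat "$workdir"/chunk.?? > "$workdir/restored.bin"
cmp "$workdir/secret.bin" "$workdir/restored.bin" && echo "round trip OK ($chunk_count chunks)"
```

Ordering relies on the sorted chunk suffixes here; with printed codes, some per-chunk metadata (such as the binary metadata field mentioned above) would be needed to recover the order.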

See also: Storing binary data in QR codes

It looks like zbarimg only supports printable characters and appends a newline:

printf '%s' 'Hello World!' >data.bin
xxd data.bin
qrencode -l H -8 -o data.png -r data.bin
zbarimg --raw --quiet data.png | xxd

I think a better, more portable option would be to base64-encode your binary data before QR encoding.

Like this:

dd if=/dev/urandom bs=10 count=1 status=none > data.bin
xxd data.bin
base64 <data.bin | qrencode -l H -8 -o data.png
zbarimg --raw --quiet data.png | base64 -d | xxd
Léa Gris
  • Yes, I was aware of the newline; I guess I should have written that it was not a problem for me. So it sounds like you think the problem is with zbarimg and not qrencode? – Zorglub29 Mar 03 '20 at 12:21
  • This works, many thanks :) A small question: do you know what is the explanation 'deep down' for the need to base64 encode here? :) – Zorglub29 Mar 03 '20 at 12:24
  • See: https://stackoverflow.com/questions/37996101/storing-binary-data-in-qr-codes – Léa Gris Mar 03 '20 at 12:26
  • Aah but wait, now this takes twice as much space :( . I want to dump quite large things, this will not work for my part. Sorry, this means I de-accept your answer for now, I would really like to have a true 'optimal' binary qr-code. – Zorglub29 Mar 03 '20 at 12:31
  • The generated binary qr-code is correct. The fault lies with `zbar`, which incorrectly decodes the data as a character string. You will have portability issues with your binary qr-code, as lots of decoders do not handle binary correctly. See the post I linked, which explains that it can be done with a patched `zbarimg`. I don't know your implementation plan, but if you have so much data that it won't fit a decently sized qr-code, you should probably QR-encode a record ID instead and retrieve the long data from a database using that ID. – Léa Gris Mar 03 '20 at 12:36
  • Ok, thanks. I will leave it open a bit longer in the hope that somebody knows an answer, or maybe another decoder. I will consider having a look at the decoder too and see if I can modify it a bit. I want to have a package to effectively dump some gpg stuff in a format that makes recovery easy. There are some solutions that work 'up to an extent' but not exactly as I want either. This quickly becomes a few kB, so size is important. – Zorglub29 Mar 03 '20 at 12:40
  • It looks like zbarimg is a dead project with no more development taking place. Do you know something about this @leagris ? – Zorglub29 Mar 03 '20 at 12:49
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/208923/discussion-between-zorglub29-and-lea-gris). – Zorglub29 Mar 03 '20 at 13:09
  • An alphanumeric QR code containing base 64-encoded data is most reliable and compatible since most decoders are designed to work with text. It should be noted that this method provides a lower capacity compared to directly encoding binary data in an 8 bit QR code. For example, a 4096 bit RSA secret key fits directly in an 8 bit QR code but it doesn't fit in an alphanumeric QR code as base 64-encoded data. – Matheus Moreira Mar 04 '20 at 01:54