
I have a large object that is read from a binary file using struct.unpack, and some of the values are character arrays that come back as bytes.

Since Python 3 reads the character arrays as bytes rather than str (as Python 2 did), they cannot be passed directly to json.dumps, because bytes objects are not JSON serializable.

Is there any way to go from unpacked struct to json without searching through each value and converting the bytes to strings?
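To illustrate the problem, here is a minimal sketch (the format string and field names are hypothetical, not from my actual file) showing struct.unpack producing bytes that json.dumps then rejects:

```python
import struct
import json

# Pack a record with a 4-byte char array and an int, then unpack it.
packed = struct.pack("4si", b"abcd", 42)
name, value = struct.unpack("4si", packed)

print(type(name))  # <class 'bytes'>

# This raises TypeError: Object of type bytes is not JSON serializable
try:
    json.dumps({"name": name, "value": value})
except TypeError as e:
    print(e)
```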

rovyko
  • So how would the binary data be represented in JSON? This is not so much a Python problem as a representation issue. Should the binary data be converted to base64? Be decoded as Latin-1? You still have to move towards a valid data structure that can be represented. – Martijn Pieters Nov 27 '17 at 12:51
  • I should have mentioned that I needed to write the JSON file as UTF-8. – rovyko Mar 12 '18 at 21:33
  • That doesn't make it any clearer. From the answer below I gather that you have bytes that *contain UTF-8 encoded text*. That's very different from a bytes object that contains, say, the data for a PNG image. – Martijn Pieters Mar 13 '18 at 07:58
  • Had you included some examples of the kind of data, it would have been a lot clearer what you were trying to do. It is the code in the accepted answer that made it clear here, not the text in your question. – Martijn Pieters Mar 13 '18 at 07:59

1 Answer


You can use a custom JSON encoder in this case:

import json

x = {}
x['bytes'] = [b"i am bytes", "test"]
x['string'] = "strings"
x['unicode'] = u"unicode string"


class MyEncoder(json.JSONEncoder):
    def default(self, o):
        # Decode bytes values as UTF-8 text; anything else falls through
        # to the base class, which raises TypeError as usual.
        if isinstance(o, bytes):
            return o.decode("utf-8")
        return super(MyEncoder, self).default(o)


print(json.dumps(x, cls=MyEncoder))
# {"bytes": ["i am bytes", "test"], "string": "strings", "unicode": "unicode string"}
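As the comments point out, this only works if the bytes actually contain UTF-8 text. If the fields can hold arbitrary binary data, a base64 variant of the same encoder pattern avoids decode errors (the class name here is just illustrative):

```python
import base64
import json


class Base64Encoder(json.JSONEncoder):
    def default(self, o):
        # Represent arbitrary bytes as a base64 string instead of
        # assuming they are valid UTF-8 text.
        if isinstance(o, bytes):
            return base64.b64encode(o).decode("ascii")
        return super().default(o)


print(json.dumps({"data": b"\x89PNG"}, cls=Base64Encoder))
# {"data": "iVBORw=="}
```

The consumer of the JSON then has to base64-decode those fields to recover the original bytes.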
Tarun Lalwani