
Which character encoding for JSON (UTF-8, UTF-16, or UTF-32) is the most space-efficient when the document mostly carries Base64-encoded binary data?

{ "data": "jA0EAwMCxamDRMfOGV5gyZPnyX1BB" }
Sebastian Barth

1 Answer


Base64 output consists only of ASCII characters, so if the bulk of your JSON is Base64-encoded data, the most space-efficient encoding is UTF-8. UTF-8 encodes ASCII characters (code points U+0000–U+007F) as one byte each, whereas UTF-16 and UTF-32 encode them as two and four bytes, respectively.
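To make the size difference concrete, here is a minimal Python sketch that serializes one JSON document and measures its length in each encoding. The 30-byte `payload` is made up for illustration, and the `-le` codec variants are chosen only so that Python does not prepend a byte order mark, which would add a few bytes to the UTF-16 and UTF-32 figures.

```python
import base64
import json

# Hypothetical payload: 30 arbitrary bytes standing in for real binary data.
payload = bytes(range(30))

# Base64 output uses only ASCII characters, so decoding as ASCII is safe.
b64_text = base64.b64encode(payload).decode("ascii")
document = json.dumps({"data": b64_text})

# Encode the same JSON text three ways and compare the byte counts.
# The "-le" codecs are used so that no byte order mark is prepended.
codecs = [("UTF-8", "utf-8"), ("UTF-16", "utf-16-le"), ("UTF-32", "utf-32-le")]
sizes = {name: len(document.encode(codec)) for name, codec in codecs}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Because every character in this document is ASCII, the UTF-16 size should come out at exactly twice the UTF-8 size, and the UTF-32 size at four times.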

Furthermore, using UTF-8 is a good idea in general, because it is the default encoding for JSON and not all tools support other encodings. From RFC 7159:

8.1 Character Encoding

JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8, and JSON texts that are encoded in UTF-8 are interoperable in the sense that they will be read successfully by the maximum number of implementations; there are many implementations that cannot successfully read texts in other encodings (such as UTF-16 and UTF-32).

Jordan Running