
The book Designing Data-Intensive Applications says:

whenever you want to send data over the network or write it to a file—you need to encode it as a sequence of bytes.

It then introduces JSON and XML as human-readable formats, and says that if you want something more compact you can use a binary encoding format such as BSON, Protobuf, or Thrift.

How does JSON (not BSON) encode data as a sequence of bytes if it is human-readable text?
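To make the question concrete, here is a minimal Python sketch of what "encoding as a sequence of bytes" can look like in both cases. The record, its field names, and the fixed binary layout are made up for illustration; the layout is not BSON, Protobuf, or Thrift, just a hand-rolled stand-in for "some compact binary format".

```python
import json
import struct

# Hypothetical record, used only for illustration.
record = {"id": 1234, "ok": True}

# JSON: first serialize to text, then encode that text as UTF-8 bytes.
text = json.dumps(record, separators=(",", ":"))
json_bytes = text.encode("utf-8")

# Assumed ad-hoc binary layout (not a real wire format):
# little-endian unsigned 32-bit integer followed by a 1-byte boolean.
binary_bytes = struct.pack("<I?", record["id"], record["ok"])

print(text)
print(" ".join(f"{b:02x}" for b in json_bytes))
print(" ".join(f"{b:02x}" for b in binary_bytes))
print(len(json_bytes), "bytes as JSON,", len(binary_bytes), "bytes in the binary layout")
```

Both results are plain byte strings. The JSON bytes happen to be the character codes of readable text, field names and punctuation included, while the binary layout stores only the values and relies on both sides agreeing on the layout in advance.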

user1008636
  • Human-readable text is often encoded as bytes. For example, ASCII encodes the character "A" as the byte 0x41. – hft May 22 '18 at 01:15
  • @hft So what is the difference between JSON and BSON if they both end up as a sequence of bytes when going across the network? – user1008636 May 22 '18 at 01:16
  • It's only human readable after a computer decodes it from bytes into something you can look at on a screen. Otherwise, like anything else in a computer, it's just bytes and stuff. So the difference between so-called "binary" and human-readable file formats is a lower level question. – Peter Ellis May 22 '18 at 01:17
  • @PeterEllis Are you essentially saying that the difference between JSON and BSON (or Protobuf) is that, at the raw level, one is text and the other is already binary, so when it goes across the network the JSON gets further encoded into binary while the other format goes over as is? And that the latter is more compact than the former? – user1008636 May 22 '18 at 01:31
  • @user1008636 The difference is *which* bytes. At a low level everything is just a string of bits. For example, the x86 "nop" instruction is the string of bits 10010000 in "binary" machine code. On the other hand, if I write "nop" in an editor like vi (say I'm editing assembly code), the string of characters "nop" is 011011100110111101110000 in "binary" ASCII encoding. The latter is "human readable" because text editors parse it as the string of characters "nop", whereas text editors do not even display 10010000 as a character, since it is non-ASCII. [This is just one example] – hft May 22 '18 at 21:21
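
The bit strings in the last comment can be checked with a short Python sketch. The helper name `to_bits` is made up for this example, and 0x90 is assumed to be the single-byte x86 `nop` opcode, as the comment implies.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as eight binary digits, concatenated."""
    return "".join(f"{b:08b}" for b in data)

# The three characters "n", "o", "p" encoded as ASCII text.
ascii_nop = "nop".encode("ascii")

# The single-byte machine-code instruction (assumed: x86 nop, 0x90).
machine_nop = bytes([0x90])

print(to_bits(ascii_nop))    # bits of the text "nop"
print(to_bits(machine_nop))  # bits of the instruction

# 0x90 is outside the 7-bit ASCII range, so it cannot be decoded as ASCII text.
try:
    machine_nop.decode("ascii")
    decodable = True
except UnicodeDecodeError:
    decodable = False
print("machine code decodable as ASCII:", decodable)
```

The text form takes three bytes and decodes back to readable characters; the machine-code form takes one byte and has no ASCII character assigned to it.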

0 Answers