22

Could somebody explain the general purpose of the bytes type in Python 3, or give some examples where it is preferred over other data types?

I see that the advantage of bytearrays over strings is their mutability, but what about bytes? So far, the only situation where I actually needed it was sending and receiving data through sockets; is there something else?

greybeard
pinwheel

2 Answers

12

Possible duplicate of what is the difference between a string and a byte string

In short, a bytes object is an immutable sequence of raw bytes (integers from 0 to 255), which is the form data takes when it is stored on disk or sent over a network. Text becomes bytes through an encoding (utf-8, utf-16, windows-1255, and so on), and each encoding maps the same characters to different byte values. A bytes object that holds encoded text can be decoded into a str.

The str type is a sequence of Unicode characters (code points). A str has to be encoded before it can be stored or transmitted. Like bytes, it is immutable, and it acts as an abstraction over the underlying byte representation.

There is a strong relationship between str and bytes. bytes can be decoded into a str, and strs can be encoded into bytes.
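A minimal sketch of that round trip, in case it helps. The sample string and the choice of utf-8 and utf-16-le are just for illustration:

```python
text = "héllo"

# The same str yields different byte sequences under different encodings.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

print(utf8_bytes)  # b'h\xc3\xa9llo'
print(len(text), len(utf8_bytes), len(utf16_bytes))  # 5 6 10

# Decoding with the matching encoding recovers the original str.
roundtrip_utf8 = utf8_bytes.decode("utf-8")
roundtrip_utf16 = utf16_bytes.decode("utf-16-le")
print(roundtrip_utf8 == text, roundtrip_utf16 == text)  # True True
```

Note that the length of the str counts characters, while the length of the bytes counts encoded bytes, so the two generally differ for non-ASCII text.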

You typically only have to use bytes when you encounter text in the wild in an unusual encoding, or when a library requires it. str, especially in Python 3, will handle the rest.
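As a rough illustration of the "in the wild" case: the bytes below stand in for data read from a legacy file. The Hebrew sample word and the use of the cp1255 codec (Python's name for windows-1255) are assumptions made for the example.

```python
# Pretend these bytes arrived from a legacy file or a network peer.
# "\u05e9\u05dc\u05d5\u05dd" is the Hebrew word "shalom".
original = "\u05e9\u05dc\u05d5\u05dd"
raw = original.encode("cp1255")

# Guessing the wrong encoding fails (or, in other cases, silently yields garbage).
try:
    raw.decode("utf-8")
    utf8_worked = True
except UnicodeDecodeError:
    utf8_worked = False

# Decoding with the encoding the data was actually written in gives the text back.
text = raw.decode("cp1255")
print(utf8_worked, text == original)
```

Until you know the encoding, the data can only be handled as bytes. Once it is decoded, the rest of the program can work with str.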

More reading here and here

Jtcruthers
  • 5
    Thanks, I know the difference between bytes and other types. I'm interested where they are used where strings can't be. – pinwheel Oct 09 '19 at 16:12
  • 2
    Updated my answer to address your comment after about 2 years. If you have any other comments, I'll be sure to get to them in 2023. – Jtcruthers Sep 08 '21 at 05:58
  • 1
    @pinwheel One example would be a server and client sending messages to each other. The method `socket.recvfrom(bufsize[, flags])` returns (bytes, address) – OuttaSpaceTime Feb 08 '22 at 09:31
0

Sockets are a good case for using bytes. You should also use bytes to read or write binary data, such as images or audio, from a file or a web API. The str type is an immutable sequence of Unicode characters, and it is typically encoded as UTF-8 when it is written out. If your data is not text, treating it as UTF-8 is at best inefficient and at worst a bug: arbitrary binary data is usually not valid UTF-8, so decoding it can fail or corrupt the data.
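Here is a small sketch of both cases, assuming a made-up payload (the 8-byte PNG signature plus a packed integer) and a temporary file name chosen only for the demo:

```python
import os
import socket
import struct
import tempfile

# The PNG signature followed by a big-endian 32-bit integer.
# This is not text: 0x89 cannot start a UTF-8 sequence.
payload = b"\x89PNG\r\n\x1a\n" + struct.pack(">I", 13)

try:
    payload.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# Binary mode ("wb" / "rb") reads and writes bytes with no encoding step.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(payload)
with open(path, "rb") as f:
    data = f.read()

# Sockets also carry bytes. socketpair() returns two connected endpoints,
# which avoids needing a real server for this demo.
a, b = socket.socketpair()
a.sendall(data)
received = b.recv(1024)  # a message this small normally arrives in one piece
a.close()
b.close()

print(is_valid_utf8, data == payload, received == payload)
```

In real socket code you would loop on `recv` until the whole message has arrived, since a stream socket does not guarantee that one `recv` matches one `sendall`.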

John Henckel