subprocess "TypeError: a bytes-like object is required, not 'str'"

Question

I'm using this code from a question asked a few years ago, but I believe it is outdated. When I run it, I get the error above. I'm still a novice in Python, so I couldn't get much clarification from similar questions. Does anyone know why this is happening?

import subprocess

def getLength(filename):
  result = subprocess.Popen(["ffprobe", filename],
    stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
  return [x for x in result.stdout.readlines() if "Duration" in x]

print(getLength('bell.mp4'))

Traceback

Traceback (most recent call last):
  File "B:\Program Files\ffmpeg\bin\test3.py", line 7, in <module>
    print(getLength('bell.mp4'))
  File "B:\Program Files\ffmpeg\bin\test3.py", line 6, in getLength
    return [x for x in result.stdout.readlines() if "Duration" in x]
  File "B:\Program Files\ffmpeg\bin\test3.py", line 6, in <listcomp>
    return [x for x in result.stdout.readlines() if "Duration" in x]
TypeError: a bytes-like object is required, not 'str'

Martijn Pieters · Accepted Answer · 2019-04-05T14:47:43.997

subprocess returns bytes objects for the stdout and stderr streams by default. That means you also need to use bytes objects in operations against these objects. "Duration" in x uses a str object. Use a bytes literal (note the b prefix):

return [x for x in result.stdout.readlines() if b"Duration" in x]

or decode your data first, if you know the encoding used (usually, the locale default, but you could set LC_ALL or more specific locale environment variables for the subprocess):

return [x for x in result.stdout.read().decode(encoding).splitlines(True)
        if "Duration" in x]
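As a minimal sketch of both options above, the example below swaps ffprobe for a child Python process that prints two lines (an assumption, so that it runs without ffmpeg installed). The names FAKE_PROBE, combined_output, duration_lines_bytes and duration_lines_decoded are hypothetical, and UTF-8 is assumed for the decode step.

```python
import subprocess
import sys

# Stand-in for ["ffprobe", filename] (assumption): a child Python process
# that prints two lines, one of which mentions "Duration".
FAKE_PROBE = [
    sys.executable, "-c",
    "print('Input #0, mov'); print('  Duration: 00:00:05.00')",
]

def combined_output(cmd):
    """Run cmd and return stdout and stderr together as one bytes object."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return out

def duration_lines_bytes(cmd):
    # Option 1: keep the data as bytes and search with a bytes literal.
    return [line for line in combined_output(cmd).splitlines(True)
            if b"Duration" in line]

def duration_lines_decoded(cmd, encoding="utf-8"):
    # Option 2: decode first, then search with an ordinary str.
    text = combined_output(cmd).decode(encoding)
    return [line for line in text.splitlines(True) if "Duration" in line]

if __name__ == "__main__":
    print(duration_lines_bytes(FAKE_PROBE))
    print(duration_lines_decoded(FAKE_PROBE))
```

The first function returns a list of bytes objects and the second a list of str objects; which one is preferable depends on what the rest of the program expects.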

The alternative is to tell subprocess.Popen() to decode the data to Unicode strings by setting the encoding argument to a suitable codec:

result = subprocess.Popen(
    ["ffprobe", filename],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
    encoding='utf8'
)

If you set text=True (Python 3.7 and up; in previous versions this argument is called universal_newlines), you also enable decoding, using your system default codec, the same one that is used for open() calls. In this mode, the pipes are line buffered by default.
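A small sketch of the text-mode approach, assuming Python 3.7 or newer for text=True. As a stand-in for ffprobe (an assumption, so that the example runs anywhere), the command is a child Python process; FAKE_PROBE and duration_lines_text are hypothetical names.

```python
import subprocess
import sys

# Stand-in for ["ffprobe", filename] (assumption): prints two lines.
FAKE_PROBE = [
    sys.executable, "-c",
    "print('Input #0, mov'); print('  Duration: 00:00:05.00')",
]

def duration_lines_text(cmd):
    # text=True makes the pipe yield str instead of bytes (Python 3.7+);
    # on older versions, pass universal_newlines=True instead.
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        text=True,
    ) as proc:
        # Iterating over the pipe yields one decoded line at a time.
        return [line for line in proc.stdout if "Duration" in line]

if __name__ == "__main__":
    print(duration_lines_text(FAKE_PROBE))
```

Passing encoding='utf8' instead of text=True should behave the same way here, except that the codec is chosen explicitly rather than taken from the system default.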

Maybe point out `universal_newlines=True` (aka `text=True` in Python 3.7+), which causes Python to decode the output as text in the system's default encoding and return a string. – tripleee, Apr 01 '19 at 17:42
The encoding argument of Popen is available from Python 3.6; in previous versions (Python 3.5 in my case), you must specify the encoding when doing the bytes conversion (`bytes("Duration", encoding='utf8')`). – adn05, Apr 05 '19 at 14:19

score 5 · Answer 2 · answered Jul 08 '17 at 19:16

As the error says, "Duration" is a string, whereas x is a bytes-like object, because result.stdout.readlines() returns the lines of the output as bytes and not as strings.

Hence, store "Duration" in a variable, say str_var, and encode it into a bytes object using str_var.encode('utf-8').

Refer to [this][1].

[1] : Best way to convert string to bytes in Python 3?
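To illustrate the suggestion above without running a subprocess, here is a short sketch; the sample line is made up for illustration, and str_var and contains_duration are hypothetical names.

```python
# A made-up line of output, as bytes (what readlines() on a binary pipe yields).
sample = b"  Duration: 00:00:05.00, start: 0.000000\n"

# Encoding the string produces the same object a bytes literal would.
str_var = "Duration"
needle = str_var.encode("utf-8")
assert needle == b"Duration"

def contains_duration(line: bytes) -> bool:
    """Return True if the bytes line mentions Duration."""
    return needle in line

def str_in_bytes_fails(line: bytes) -> bool:
    """Show that testing a str against bytes raises TypeError."""
    try:
        "Duration" in line
    except TypeError:
        return True
    return False

if __name__ == "__main__":
    print(contains_duration(sample))
    print(str_in_bytes_fails(sample))
```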

Harshith Thota

It's just a literal, just prefix it with `b`. You don't need to store the string in a variable to be able to encode it either, `"Duration".encode('utf-8')` works too (but is a waste of computer cycles if you can just make it a bytes object to begin with). – Martijn Pieters Jul 08 '17 at 19:17
Well, if he wants to use it for multiple files, it's better to store it in a variable. Now, mind explaining why a downvote for that? – Harshith Thota Jul 08 '17 at 19:19
Why? A string literal is stored as a constant with the code object anyway, and where are they mentioning multiple files? – Martijn Pieters Jul 08 '17 at 19:20
Note that the test is done in a loop, using a literal is *better there* because that loads a constant, rather than having to look up a variable each time. – Martijn Pieters Jul 08 '17 at 19:20
Fair enough but still doesn't explain the downvote. It's not a wrong answer. – Harshith Thota Jul 08 '17 at 19:22
