
I'm trying to extract the audio from a pytube video, then convert it into wav format. For extracting the audio from the video, I tried to use moviepy, but I can't find a way to open a video file from bytes with VideoFileClip. I don't want to keep saving files then reading them.

My attempt:

from io import BytesIO

from pytube import YouTube
import moviepy.editor as mp

yt_video = BytesIO()
yt_audio = BytesIO()

yt = YouTube(text)
videoStream = yt.streams.get_highest_resolution()
videoStream.stream_to_buffer(yt_video) # save video to buffer

my_clip = mp.VideoFileClip(yt_video) # processing video 

my_clip.audio.write_audiofile(yt_audio) # extracting audio from video
MmBaguette

1 Answer


You can get the URL of the stream and extract the audio using ffmpeg-python.

The ffmpeg-python module runs FFmpeg as a sub-process. FFmpeg transcodes the audio to a PCM codec in a WAV container and writes it to its stdout pipe.
The Python side reads the audio from that pipe into a memory buffer, so no intermediate file is written.

Here is a code sample:

from pytube import YouTube
import ffmpeg

text = 'https://www.youtube.com/watch?v=07m_bT5_OrU'

yt = YouTube(text)

# https://github.com/pytube/pytube/issues/301
stream_url = yt.streams[0].url  # Get the URL of the video stream (the query is list-like; .all() is deprecated)

# Probe the audio streams (use it in case you need information like sample rate):
#probe = ffmpeg.probe(stream_url)
#audio_streams = next((stream for stream in probe['streams'] if stream['codec_type'] == 'audio'), None)
#sample_rate = audio_streams['sample_rate']

# Read audio into memory buffer.
# Get the audio using stdout pipe of ffmpeg sub-process.
# The audio is transcoded to PCM codec in a WAV container.
audio, err = (
    ffmpeg
    .input(stream_url)
    .output("pipe:", format='wav', acodec='pcm_s16le')  # Select WAV output format and pcm_s16le audio codec. May add ar=sample_rate
    .run(capture_stdout=True)
)

# Write the audio buffer to file for testing
with open('audio.wav', 'wb') as f:
    f.write(audio)
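
If the goal is to keep everything in memory instead of writing audio.wav, the bytes returned by run() can be wrapped in a BytesIO object and handed to anything that accepts a file-like object. The following is only a minimal sketch: the helper names make_test_wav and describe_wav are hypothetical, and a small synthetic WAV built with the standard wave module stands in for FFmpeg's output so that the snippet runs without a network connection. One caveat to keep in mind: a WAV written to a pipe may carry placeholder size fields in its header, so the frame count reported by the header should not be trusted; the channel count, sample width and sample rate are read from the format chunk and should be usable.

```python
import io
import wave


def make_test_wav(sample_rate=8000, n_frames=800):
    """Build a tiny mono 16-bit WAV in memory (stand-in for FFmpeg's output)."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(b'\x00\x00' * n_frames)  # silence
    return buf.getvalue()


def describe_wav(audio):
    """Wrap WAV bytes in BytesIO and read the format fields from the header."""
    buf = io.BytesIO(audio)
    with wave.open(buf, 'rb') as w:
        info = (w.getnchannels(), w.getsampwidth(), w.getframerate())
    buf.seek(0)  # rewind so the caller can read the buffer from the start
    return buf, info


audio = make_test_wav()  # in the real code this would be the bytes from run()
yt_audio, (channels, sample_width, sample_rate) = describe_wav(audio)
print(channels, sample_width, sample_rate)
```

Here yt_audio plays the role of the BytesIO buffer from the question; wave leaves a caller-supplied file object open, so the buffer remains usable after the with block.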

Notes:

  • You may need to download the FFmpeg command-line tool.
  • The code sample works, but I am not sure how robust it is.
Rotem
  • How can I save it to memory without downloading anything? – MmBaguette Apr 24 '21 at 22:49
  • I am not sure I can find a solution... I have a few questions: **1.** Does using the URL `yt.streams.all()[0].url` work correctly? **2.** Do you want to get both the audio and the video, or only the audio? **3.** Must the solution use moviepy, or may it use other packages like [ffmpeg-python](https://github.com/kkroening/ffmpeg-python)? – Rotem Apr 25 '21 at 08:10
  • It doesn't matter what packages I'm using. I'm just trying to extract only the audio in a .wav format. Your solution was only an alternative to another I already found. How can I do the same thing but in memory? – MmBaguette Apr 25 '21 at 16:20
  • I posted a solution that reads WAV into memory. – Rotem Apr 25 '21 at 19:19
  • This is terrific! Thank you so much. I converted the bytes result into a BytesIO stream. – MmBaguette Apr 25 '21 at 23:22