I'm using ffmpeg-python to read the streams of a video and write some parameters (codec_name, resolution, etc.) for each stream to a CSV file.

import ffmpeg

video = 'test.mkv'
probe = ffmpeg.probe(video)
video_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'video'), None)
print(video_stream['codec_long_name'])
audio_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'audio'), None)
...

My problem is that this works for the video stream, but not for multiple audio (or subtitle) streams. If the video has several audio streams, it returns only the first one.

I've tried another approach, but it returns some streams 2-3 times, so I get duplicates. If the video sample has 4 audio tracks, I end up with ~9 audio streams instead of 4.

audio_streams = []
for audio in (probe['streams']):
    if (audio['codec_type'] == 'audio'):
        audio_streams.append(audio)
        pprint(audio_streams)

Nothing else I tried works. I'm new to programming and I'm stuck. How can I get all audio streams from a file without duplicates?

Apollo
  • How many audio streams do you count for `six_audio_streams.mp4`? Create it using the following shell command: `ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1 -f lavfi -i sine=frequency=100 -f lavfi -i sine=frequency=500 -f lavfi -i sine=frequency=600 -f lavfi -i sine=frequency=700 -f lavfi -i sine=frequency=800 -f lavfi -i sine=frequency=900 -map 0 -map 1 -map 2 -map 3 -map 4 -map 5 -acodec aac -ar 22050 -ac 1 -t 10 six_audio_streams.mp4` – Rotem Jul 27 '22 at 20:53
  • If I use the first approach `audio_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'audio'), None)` for six_audio_streams.mp4, it prints one stream (with 'index': 1). If I use the second approach, I get 15: the first stream (with 'index': 1) was printed 5 times, the second stream 4 times, the third 3 times, etc. So it looks like something is wrong with the for loop, but I can't figure out what. – Apollo Jul 28 '22 at 06:53
  • It's not reproducible... Using the second approach there are exactly 6 prints. I don't know what `pprint` is. Try replacing `pprint(audio_streams)` with `print('$')` and count the dollars. There is a chance that something is wrong with your setup, but it's hard to tell what. – Rotem Jul 28 '22 at 07:53
  • So, I've tried `print('$')`, experimented with it a bit, and finally solved the puzzle. Printing `audio` instead returns exactly the streams I want: `for audio in (probe['streams']): if (audio['codec_type'] == 'audio'): print(audio)` Thanks! – Apollo Jul 28 '22 at 09:55
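
For anyone landing here with the same symptom: the duplicates in the second attempt come from where the print sits. `pprint(audio_streams)` is inside the loop, so the whole growing list is printed again each time a stream is appended. Below is a minimal sketch that collects the streams first and writes them out once. `streams_of_type`, `write_stream_rows`, and `sample_probe` are hypothetical names and data introduced for illustration; `sample_probe` is a hand-made stand-in for the dict that `ffmpeg.probe(video)` returns, so the snippet runs without a media file.

```python
import csv
import sys


def streams_of_type(probe, codec_type):
    """Return every stream whose codec_type matches, in file order."""
    return [s for s in probe["streams"] if s.get("codec_type") == codec_type]


def write_stream_rows(streams, fields, out):
    """Write one CSV row per stream, keeping only the requested fields."""
    writer = csv.DictWriter(out, fieldnames=fields,
                            extrasaction="ignore", restval="")
    writer.writeheader()
    writer.writerows(streams)


# Hypothetical stand-in for probe = ffmpeg.probe('test.mkv');
# real probe results carry many more keys per stream.
sample_probe = {
    "streams": [
        {"index": 0, "codec_type": "video", "codec_name": "h264"},
        {"index": 1, "codec_type": "audio", "codec_name": "aac", "channels": 1},
        {"index": 2, "codec_type": "audio", "codec_name": "aac", "channels": 2},
        {"index": 3, "codec_type": "subtitle", "codec_name": "subrip"},
    ]
}

audio_streams = streams_of_type(sample_probe, "audio")

# Output happens once, after the list is complete, so nothing is repeated.
write_stream_rows(audio_streams, ["index", "codec_name", "channels"], sys.stdout)
```

With a real file, swap `sample_probe` for the `probe` dict from the question. The same helper should also cover subtitles by passing `"subtitle"` as the type.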
