
I have some code that works great for a small number of mp4s, but by the 100th one I start to run out of RAM. I know you can write CSV files sequentially; I am just not sure how to do that for mp4s. Here is the code I have:

This works for small batches:

from moviepy.editor import *
import os
from natsort import natsorted

L = []

for root, dirs, files in os.walk("/path/to/the/files"):
    # natural sort so e.g. clip2.mp4 comes before clip10.mp4
    files = natsorted(files)
    for file in files:
        if os.path.splitext(file)[1] == '.mp4':
            filePath = os.path.join(root, file)
            video = VideoFileClip(filePath)
            L.append(video)

final_clip = concatenate_videoclips(L)
final_clip.to_videofile("output.mp4", fps=24, remove_temp=False)  # write_videofile in current moviepy

The code above is what I tried. At first glance I expected a smooth result; it worked perfectly on a test batch, but it could not handle the main batch.

Dray

1 Answer


You appear to be appending the contents of a large number of video files to a list, yet you report that available RAM is much less than the total size of those files. So don't accumulate the result in memory.

Follow one of these approaches:

keep an open file descriptor

        with open("combined_video.mp4", "wb") as fout:
            for file in files:
                ...
                video = ...
                fout.write(video)

Or perhaps it is fout.write(video.data) or video.write_segment(fout) -- I don't know what video I/O library you're using.

The point is that the somewhat large video object is re-assigned each time, so it does not grow without bound, unlike your list L.
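For illustration, here is a minimal, self-contained sketch of that pattern using plain byte copies (shutil.copyfileobj is my assumption, not something from the question). As the comments below point out, naively concatenating MP4 container bytes generally does not produce a playable file, so treat this purely as a demonstration of the flat memory behaviour:

        import os
        import shutil
        from natsort import natsorted

        # One output descriptor stays open; each input is copied through a small,
        # fixed-size buffer, so memory use stays flat however many files there are.
        with open("combined_video.mp4", "wb") as fout:
            for root, dirs, files in os.walk("/path/to/the/files"):
                for file in natsorted(files):
                    if os.path.splitext(file)[1] == '.mp4':
                        with open(os.path.join(root, file), "rb") as fin:
                            shutil.copyfileobj(fin, fout)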

append to existing file

We can nest in the other order, if that's more convenient.

        for file in files:
            with open("combined_video.mp4", "ab") as fout:
                ...
                video = ...
                fout.write(video)

Here we're doing a binary append. Repeated open / close is slightly less efficient. But it has the advantage of letting you do a run with four input files, then python exits, then later do another run with a pair of new files, and you'll still find the expected half dozen inputs' worth of data in the combined output.
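A sketch of that append variant under the same assumptions (plain byte copies, hypothetical paths); because the output is opened in "ab" mode, a later run with new inputs simply extends whatever earlier runs already wrote:

        import os
        import shutil

        for file in files:                      # e.g. the natsorted list from the question
            path = os.path.join(root, file)
            # "ab" opens for binary append, so data from previous runs is preserved.
            with open("combined_video.mp4", "ab") as fout:
                with open(path, "rb") as fin:
                    shutil.copyfileobj(fin, fout)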

J_H
  • You my good sir are a saint, often the simplest solution is the best. – Dray Jan 02 '23 at 17:44
  • *"I don't know what video I/O library you're using"* - Isn't it `moviepy`, which they imported and tagged? – Kelly Bundy Jan 02 '23 at 18:08
  • @Dray So this works? Simply concatenating the files? Pretty surprising (though I don't know the format). – Kelly Bundy Jan 02 '23 at 18:12
  • Actually it didn't: `File "merger.py", line 17, in fout.write(video) TypeError: a bytes-like object is required, not 'VideoFileClip'` but it will be fine – Dray Jan 02 '23 at 22:27
  • It was a simple fix though. I'll post the finished code later – Dray Jan 02 '23 at 22:45
  • Good god though, it's slow. I'm processing 1000 files of 250mb-500mb each. Ideas, anyone? I've already disabled audio and tried fiddling with the thread parameter. – Dray Jan 02 '23 at 23:18
  • Though I do have a batch of 240 files that together add up to about 60 gigs worth... This sucks – Dray Jan 02 '23 at 23:20
  • Eh, "rewrote" everything in ffmpeg; it just processed 60 gigs in less than an hour – Dray Jan 03 '23 at 01:05
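The comments above mention rewriting the job in ffmpeg but don't show the code. A sketch of what that might look like, using ffmpeg's concat demuxer (my guess at the approach, not the asker's actual script; filenames are hypothetical, and -c copy only works when all inputs share the same codecs and parameters):

        import os
        import subprocess
        from natsort import natsorted

        # Collect the .mp4 paths in natural order, same as the original script.
        clips = []
        for root, dirs, files in os.walk("/path/to/the/files"):
            for file in natsorted(files):
                if os.path.splitext(file)[1] == '.mp4':
                    clips.append(os.path.join(root, file))

        # The concat demuxer reads its inputs from a listing file,
        # one line per clip in the form: file '/path/to/clip.mp4'
        with open("inputs.txt", "w") as listing:
            for clip in clips:
                listing.write(f"file '{clip}'\n")

        # -c copy remuxes without re-encoding, which is why it is fast and cheap on RAM.
        subprocess.run(
            ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "inputs.txt",
             "-c", "copy", "output.mp4"],
            check=True,
        )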