How to load and resample (MP3) audio files faster in Python/Linux?

Question

Currently, I am trying to load 280,000 MP3 audio files in Python where the average duration of files is ~5 seconds. I am using Librosa for this purpose as well as for the further processing (e.g. computing spectrogram) in later stages.

However, I realized that loading the files is very slow: on average it takes 370 milliseconds for each file to be loaded, decoded and resampled. If I turn off the resampling (i.e. `librosa.load(..., sr=None)`), it takes around 200 milliseconds, which is still not good considering the large number of files I have. Unsurprisingly, loading WAV files without resampling is very fast (< 1 ms), but with resampling it takes around 160 milliseconds.

Now I was wondering if there is any faster approach for doing this, whether directly in Python or using external tools in Linux with the condition that I can later load the results back to Python.

By the way, I have tried using multiprocessing with a pool of size 4 and achieved 2-3x speed-up, but I am looking for more (preferably > 10x).
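For reference, a pool-based loader of the kind described above might look roughly like the sketch below. The name `load_many` and its parameters are hypothetical, and `loader` stands in for whatever per-file function is used (for example a small wrapper around `librosa.load(path, sr=16000)`); this is not the exact code that produced the timings above.

```python
from multiprocessing import Pool


def load_many(paths, loader, workers=4, chunksize=64):
    """Apply `loader` to every path, optionally across worker processes.

    `loader` must be a picklable top-level function, e.g. a small wrapper
    around librosa.load(path, sr=16000).
    """
    if workers <= 1:
        # Serial fallback: handy for debugging and for timing a baseline.
        return [loader(p) for p in paths]
    with Pool(processes=workers) as pool:
        # chunksize batches several paths per task to reduce
        # inter-process communication overhead.
        return pool.map(loader, paths, chunksize=chunksize)
```

Note that every decoded array is pickled and sent back to the parent process, so the speed-up is usually smaller than the number of workers.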

Note: the original files contain human voice and have a sample rate of 48 kHz and a bit rate of 64 kbps; I want to downsample them to 16 kHz.

@hendrik Thanks a lot! I tried `pysox` by downsampling and converting mp3 files to wav and on average it took 20 milliseconds for each file. Much better, even better than `ffmpeg` which I also tried and it took 100 milliseconds for the same operation. — today, Jul 23 '19 at 10:03
Maybe a silly question, but I can't see how to simply read the mp3 file without doing any conversion? — moinudin, Apr 12 '21 at 18:43
If by "conversion" you are only referring to resampling step, then it's possible to not perform resampling; however, for MP3 files (unlike wav files), I guess they should be at least uncompressed/decoded first to get the raw samples, so that step would be required at least (though, I am not an expert on that topic). — today, Apr 12 '21 at 18:56

Hendrik · Answer 1 · 2021-08-02T11:00:01.627

7

You could use pysox.

It's a thin Python wrapper around SoX, "the Swiss Army knife of sound processing programs."
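A minimal sketch of how this could look for the files in the question, assuming pysox is installed and the underlying SoX build can read MP3 (on some distributions that needs an extra format package). The helper names `wav_path_for` and `convert_to_16k_wav` and the output naming scheme are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def wav_path_for(mp3_path, out_dir):
    """Map e.g. clips/a.mp3 to <out_dir>/a.wav (hypothetical naming scheme)."""
    return Path(out_dir) / (Path(mp3_path).stem + ".wav")


def convert_to_16k_wav(mp3_path, out_dir, target_sr=16000):
    """Decode one MP3 and write a resampled WAV file using pysox."""
    # Imported lazily so this module can be loaded without pysox/SoX present.
    import sox

    out_path = wav_path_for(mp3_path, out_dir)
    tfm = sox.Transformer()
    tfm.rate(samplerate=target_sr)  # resample, e.g. 48 kHz -> 16 kHz
    tfm.build(str(mp3_path), str(out_path))
    return out_path
```

The converted WAV files can then be read back in Python without further resampling, for example with `librosa.load(path, sr=None)`.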

Note: For faster processing (avoiding exec calls), you can also install and use soxbindings. All you need to do is replace `import sox` with `import soxbindings as sox`.

edited Aug 02 '21 at 11:00

answered Jul 23 '19 at 10:08

