I have a dataset of thousands of bird chirp recordings (MP3) and I am trying to load them using librosa.load().

The MP3 files load without raising an error but, most of the time, the resulting data is an empty np.ndarray instead of an np.ndarray filled with floats.

To compare the MP3 metadata I used pydub.utils.mediainfo(). This function returns information such as sampling rate, codec, duration, bit rate, start_time, ...

I found that start_time explains the failed loads: every file whose start_time is 0 fails to load correctly, whereas every file whose start_time is greater than 0 loads correctly.

I have no problem listening to every single MP3 file in the VLC media player.

Is there anything that can explain this behavior? Is there a way to make these files load successfully?
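A minimal sketch of that comparison, assuming mediainfo() returns each value as a string under the key "start_time" (the helper names split_by_start_time and collect_metadata are made up for illustration):

```python
def split_by_start_time(metadata_by_file):
    """Split file paths into those whose start_time is 0 and those above 0.

    metadata_by_file maps a path to a dict shaped like the one returned by
    pydub.utils.mediainfo() (assumption: "start_time" holds a numeric string).
    """
    zero_start, positive_start = [], []
    for path, info in metadata_by_file.items():
        start = float(info["start_time"])
        if start == 0.0:
            zero_start.append(path)
        else:
            positive_start.append(path)
    return zero_start, positive_start


def collect_metadata(paths):
    """Read the metadata of every file with pydub (needs ffprobe available)."""
    import pydub.utils  # imported here so the helper above stays dependency-free

    return {path: pydub.utils.mediainfo(path) for path in paths}
```

Calling split_by_start_time(collect_metadata(paths)) on the dataset gives two lists that can be checked against the files that come back empty from librosa.load().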

Clément
  • What happens when start_time is a very small but non-zero value, like 0.1, 0.01, etc.? If you can provide a minimal failing example, then this is probably something you should file as a bug with librosa, along with the other information needed to reproduce it, such as the versions of the relevant libraries, the OS, etc. – Jon Nordby Nov 21 '22 at 19:36

1 Answer

I had the same very specific error. The error message I was getting was "Input signal length=0 is too small to resample from 48000->22050", which happened because librosa was loading empty arrays under the same circumstances you describe.

My workaround was to specify the duration parameter. In this case I set it to the full length of the file:

import math
import librosa
import pydub.utils

dur = pydub.utils.mediainfo(filepath)["duration"]
data, sr = librosa.load(filepath, duration=math.floor(float(dur)))

This solved the empty arrays for me.
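If only some files are affected, the same workaround can be applied as a fallback. Below is a rough sketch, not a tested fix: load_with_duration_fallback is a hypothetical helper, and it assumes a failed load either returns an empty array or raises ValueError (as with the resampling message above). The loader and the duration lookup are passed in as arguments so the logic can be tried without audio files.

```python
import math


def load_with_duration_fallback(filepath, load, get_duration):
    """Load a file normally; retry with an explicit duration if nothing came back.

    load:         callable like librosa.load(path, duration=...) -> (data, sr)
    get_duration: callable returning the file length in seconds (number or string)
    """
    try:
        data, sr = load(filepath)
    except ValueError:
        # Assumption: resampling a zero-length signal surfaces as ValueError
        data, sr = [], None

    if len(data) == 0:
        # Same workaround as above: pass the full length, floored to whole seconds
        duration = math.floor(float(get_duration(filepath)))
        data, sr = load(filepath, duration=duration)
    return data, sr


# Intended use with the real libraries (not executed here):
#   import librosa, pydub.utils
#   data, sr = load_with_duration_fallback(
#       filepath,
#       librosa.load,
#       lambda p: pydub.utils.mediainfo(p)["duration"],
#   )
```

Note that math.floor() turns any duration under one second into 0, so very short clips would need the unrounded float(dur) instead.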