I loaded an MP3 file in Python with both torchaudio and librosa:

import torchaudio
import librosa

filename='example.mp3'
array_tor, sample_rate_tor = torchaudio.load(filename,format='mp3')
array_lib, sample_rate_lib = librosa.load(filename, sr=sample_rate_tor)
print(len(array_tor.numpy()[0]), len(array_lib))  # the two lengths differ

The two arrays have different lengths. What causes the difference, and how can I make them the same?

If I convert example.mp3 to a WAV file with

from pydub import AudioSegment
audSeg = AudioSegment.from_mp3('example.mp3')
audSeg.export('example.wav', format="wav")

and load the WAV file with torchaudio, librosa, and soundfile

import torchaudio
import librosa
import soundfile as sf
filename='example.wav'
array_tor_w, sample_rate_tor_w = torchaudio.load(filename,format='wav')
array_lib_w, sample_rate_lib_w = librosa.load(filename, sr=sample_rate_tor_w)
array_sfl_w, sample_rate_sfl_w = sf.read(filename)
print(len(array_tor_w.numpy()[0]), len(array_lib_w), len(array_sfl_w))  # all three lengths match

All three arrays have the same length and content, and that length also matches len(array_lib) from the MP3 file.

So torchaudio.load() seems to be the one that handles MP3 files differently.

aaaaa
  • `.wav` is a full-fidelity (i.e. lossless) sound file format; I would expect a loaded .wav file to be treated exactly the same in every player. Whereas `.mp3`, being a compressed, lossy format, can be interpreted differently from player to player. Each player could have its own internal representation of the .mp3. You can even have different sized .mp3's for the same song, due to the compression settings used. – Robert Harvey Apr 14 '22 at 12:07
  • See https://en.wikipedia.org/wiki/Lossy_compression – Robert Harvey Apr 14 '22 at 12:09

1 Answer

This is due to the underlying decoder library torchaudio uses.

Up until v0.11, torchaudio used libmad, which does not remove the extra padding when decoding MP3.

See https://github.com/pytorch/audio/issues/1500 for details.

In v0.12, torchaudio switched its MP3 decoder to FFmpeg, so the padding issue should be resolved.
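
If upgrading is not an option, one possible workaround is to truncate both arrays to the shorter length before comparing them. The helper below is only a minimal sketch: trim_to_common_length is a hypothetical name, the sample values are made up, and it assumes the extra padding sits at the end of the longer array, which I have not verified. If the padding is at the start, the signals would have to be aligned first (for example by cross-correlation).

```python
def trim_to_common_length(a, b):
    """Truncate two 1-D sample sequences to the length of the shorter one.

    Assumption (not confirmed): the extra samples are at the END of the
    longer sequence. Works on lists and on 1-D NumPy arrays alike, since
    it only relies on len() and slicing.
    """
    n = min(len(a), len(b))
    return a[:n], b[:n]


# Made-up stand-ins for array_tor.numpy()[0] and array_lib.
decoded_with_padding = [0.1, 0.2, 0.3, 0.0, 0.0]
decoded_without_padding = [0.1, 0.2, 0.3]

x, y = trim_to_common_length(decoded_with_padding, decoded_without_padding)
print(len(x), len(y))
```

With the arrays from the question, the call would be trim_to_common_length(array_tor.numpy()[0], array_lib).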

moto