I am attempting to preprocess audio files for use in a neural net with soundfile.read(), but the function formats the returned data differently for different .FLAC files with the same sample rate and length. For example, calling data, sr = soundfile.read(audiofile1) produced an array with data.shape = (48000, 2) (where each element was either the amplitude, 0, or the negative amplitude, as NumPy float64), while calling data, sr = soundfile.read(audiofile2) produced an array with data.shape = (48000,) (where the elements were varied NumPy float64 values).
Also, if it helps, audiofile1 was a recording made via PyAudio, whereas audiofile2 was a sample from the LibriSpeech corpus.
So, my question is twofold: why is soundfile.read() producing two different data formats, and how do I ensure that the function returns arrays in the same format in the future?
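For reference, here is a minimal sketch of the kind of normalisation I have in mind. It rests on an assumption I have not verified for my files: that the (48000, 2) array is simply a two-channel (stereo) recording laid out as frames x channels, while the (48000,) array is a mono file. The helper names to_mono and read_mono are hypothetical, and averaging the channels is just one possible way to downmix. The always_2d argument of soundfile.read() is used so that the loader always sees a 2-D array before reducing it.

```python
import numpy as np


def to_mono(data: np.ndarray) -> np.ndarray:
    """Return a 1-D array of frames, averaging channels when there are several.

    Assumes a 2-D input is laid out as (frames, channels).
    """
    data = np.asarray(data)
    if data.ndim == 1:
        # Already mono: nothing to do.
        return data
    if data.ndim == 2:
        # Downmix by averaging across the channel axis (an assumed choice).
        return data.mean(axis=1)
    raise ValueError(f"expected a 1-D or 2-D array, got {data.ndim} dimensions")


def read_mono(path):
    """Read an audio file and return (mono_data, sample_rate)."""
    # Imported here so that to_mono can be used without libsndfile installed.
    import soundfile as sf

    # always_2d=True makes mono files come back as (frames, 1) instead of (frames,).
    data, sr = sf.read(path, always_2d=True)
    return to_mono(data), sr


if __name__ == "__main__":
    # Synthetic stand-ins for the two shapes described in the question.
    stereo = np.zeros((48000, 2))
    mono = np.zeros(48000)
    print(to_mono(stereo).shape, to_mono(mono).shape)
```

With something like this, read_mono(audiofile1) and read_mono(audiofile2) would both return 1-D arrays, provided the stereo assumption above actually holds for the PyAudio recording.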