
I'm crawling across a folder of WAV files, each with the same sample rate but a different length. I load them with Librosa and compute a range of spectral features on each one. Because the durations differ, the resulting arrays have different sizes, so trying to concatenate them all fails. For example:

shape(1,2046)
shape(1,304)
shape(1,154)
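
To make the failure concrete, here is a minimal sketch that reproduces it with dummy arrays of the shapes above. The zero-filled arrays are placeholders, not real spectral data.

```python
import numpy as np

# Placeholder arrays with the same shapes as the features listed above.
features = [np.zeros((1, 2046)), np.zeros((1, 304)), np.zeros((1, 154))]

try:
    np.concatenate(features, axis=0)
    error_message = None
except ValueError as exc:
    # Stacking along axis 0 requires every array to match on axis 1.
    error_message = str(exc)

print(error_message)
```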

So, before loading the files, I use Librosa to get the duration of each file and pack the durations into a list.

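import librosa
import numpy as np

class GetDurations:

    def __init__(self, files, samplerate):
        self.files = files
        self.sampleRate = samplerate
        # Collect the duration (in seconds) of every file.
        # Note: newer librosa releases may expect `path` instead of `filename`.
        durations = []
        for file in self.files:
            durations.append(librosa.get_duration(filename=file, sr=self.sampleRate))

        # Longest duration across all files, in seconds.
        self.maxFileDuration = np.max(durations)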

Then I take the maximum value of the list, which gives the maximum possible length of my array, and convert it to frames (which is what Librosa's spectral feature extraction works with):

self.maxDurationInFrames = librosa.time_to_frames(self.getDur.maxFileDuration,
                                                  sr=44100, hop_length=512) + 1
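
As a rough check of that conversion, the sketch below redoes the arithmetic in plain Python for a hypothetical 3.5-second file. It assumes `librosa.time_to_frames` floors the sample count divided by the hop length, and that the feature extractors use centred framing, which yields one extra frame. Both points are my understanding rather than something verified here, and `duration_to_frames` is a made-up helper name.

```python
def duration_to_frames(duration_s, sr=44100, hop_length=512):
    """Approximate librosa.time_to_frames: floor(duration * sr / hop_length)."""
    n_samples = int(duration_s * sr)
    return n_samples // hop_length

# Hypothetical longest file: 3.5 seconds at 44.1 kHz.
# 3.5 * 44100 = 154350 samples, and 154350 // 512 = 301 frames.
max_frames = duration_to_frames(3.5) + 1  # +1 for the assumed extra centred frame
print(max_frames)
```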

So now I've got a value that I know will account for the longest duration of my input files. I just need to initialise my array with this length.

allSpectralCentroid = np.empty((0, self.maxDurationInFrames))

This gives me an empty container for the extracted spectral centroid data of every WAV file in the directory. To add data to this array, I later do the following:

padValue = allSpectralCentroid.shape[1] - workingSpectralCentroid.shape[1]
workingSpectralCentroid = np.pad(workingSpectralCentroid[0], ((0, padValue)), mode='constant')[np.newaxis]
allSpectralCentroid = np.append(allSpectralCentroid, workingSpectralCentroid, axis=0)

This subtracts the length of the 'working' array from the length of the 'all' array to get a pad value. It then pads the working array with zeros on the right so that it is the same length as the rows of the 'all' array. Finally, it appends the padded working array to the 'all' array as a new row and assigns the result back to the 'all' variable.
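
For anyone who wants to run this step in isolation, here is a minimal sketch of the same pad-and-append logic on dummy data. The value of `max_frames` and the dummy arrays are hypothetical stand-ins for `self.maxDurationInFrames` and the real per-file features.

```python
import numpy as np

max_frames = 6  # hypothetical stand-in for self.maxDurationInFrames
all_centroids = np.empty((0, max_frames))

# Dummy per-file features of shape (1, n_frames), each shorter than max_frames.
for working in (np.ones((1, 4)), np.full((1, 2), 2.0)):
    pad_value = all_centroids.shape[1] - working.shape[1]
    # Pad only on the right of the frame axis; the leading axis stays size 1.
    working = np.pad(working, ((0, 0), (0, pad_value)), mode='constant')
    # np.append returns a new array, so the result has to be reassigned.
    all_centroids = np.append(all_centroids, working, axis=0)

print(all_centroids)
```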

So, my question is: is there a more efficient way to do this?

Bonus question: how do I do this when I can never know the required length in advance?
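
One possible direction for both questions, sketched under assumptions rather than offered as a definitive answer: collect each file's feature in a plain Python list and pad only once at the end, when the longest length is known from the data itself. The helper name `stack_with_padding` and the dummy arrays are hypothetical.

```python
import numpy as np

def stack_with_padding(rows):
    """Zero-pad 1-D arrays on the right to a common length and stack them."""
    max_len = max(len(row) for row in rows)
    out = np.zeros((len(rows), max_len))
    for i, row in enumerate(rows):
        out[i, :len(row)] = row
    return out

# Dummy features standing in for per-file spectral centroids.
collected = [np.arange(1, 5), np.arange(1, 3), np.arange(1, 4)]
stacked = stack_with_padding(collected)
print(stacked.shape)
```

Appending to a list does not copy the data already collected, whereas `np.append` builds a new array on every call, so this shape of solution avoids both the repeated copying and the need to scan durations up front.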

DrewTNBD
