I am analyzing song audio with a Python library. The output is a NumPy array, and it is very large because the MFCCs are calculated for every frame of the audio. When I write this output to a file, each song takes about 3-4 MB. Is there a way to reduce the N frames of information to a single row of features?
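To make the goal concrete, here is a minimal sketch of one possible reduction: taking the mean and standard deviation of each coefficient over all frames. It uses only NumPy and a synthetic array in place of real MFCC output. The `(n_frames, n_coeffs)` layout, the 13 coefficients, and the helper name `summarize_frames` are assumptions for illustration, not something the library prescribes; some libraries return the transposed layout, in which case the array would need `.T` first.

```python
import numpy as np


def summarize_frames(mfcc: np.ndarray) -> np.ndarray:
    """Collapse a (n_frames, n_coeffs) array into one row.

    The row holds the per-coefficient mean followed by the
    per-coefficient standard deviation, so its length is 2 * n_coeffs.
    """
    if mfcc.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_frames, n_coeffs)")
    # axis=0 runs over frames, leaving one value per coefficient.
    return np.concatenate([mfcc.mean(axis=0), mfcc.std(axis=0)])


# Synthetic stand-in for real MFCC output (assumed shape: frames x coefficients).
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(10_000, 13))

features = summarize_frames(mfcc)
print(mfcc.shape, "->", features.shape)
```

With the assumed shape, the 10,000 x 13 array becomes a single row of 26 values. Averaging discards all ordering of the frames, so whether such summary statistics keep enough information depends on what the features are used for.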