I am using pydub to run some experiments on an audio file. After loading it, I want to take some parts and analyze them further with numpy, so I extract the raw data as explained in "Pydub raw audio data":
import array
import numpy as np
from pydub import AudioSegment
from pydub.utils import get_array_type

song = AudioSegment.from_file("dodos_delight.m4a")
# Take the first 17.5 seconds (pydub slices in milliseconds)
start = 0
end = 17.5*1000
guitar = song[start:end]
# Now get the raw data as an array
bit_depth = song.sample_width * 8
array_type = get_array_type(bit_depth)
fs = song.frame_rate
guitar_np = array.array(array_type, guitar.raw_data)
guitar_t = np.arange(0,len(guitar_np)/fs,1/fs)
However, len(guitar_np)/fs = 35, which does not make sense: it is exactly double what it should be. The only way to get 17.5 would be if fs were doubled, but the samples are supposed to be 1/fs seconds apart.
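To make the arithmetic concrete, here is a minimal sketch that uses a synthetic silent buffer instead of the real file. It assumes 16-bit stereo audio at 44100 Hz, which I have not verified for my file (pydub exposes the channel count as song.channels), and the names n_frames, naive_duration and frame_duration are purely illustrative. If the raw data interleaves the samples of all channels, dividing the sample count by fs alone would produce exactly this kind of doubling:

```python
import array

# Assumed values for illustration only; the real file may differ.
fs = 44100           # frames per second
channels = 2         # stereo (assumption)
sample_width = 2     # bytes per sample, i.e. 16-bit
duration_s = 17.5

# Silent placeholder standing in for guitar.raw_data:
# one sample per channel for every frame, interleaved (L, R, L, R, ...).
n_frames = int(duration_s * fs)
raw_data = bytes(n_frames * channels * sample_width)

samples = array.array("h", raw_data)  # "h" = signed 16-bit integer

# Dividing by fs alone counts every channel's samples as separate time steps.
naive_duration = len(samples) / fs
# Dividing by fs * channels counts frames instead.
frame_duration = len(samples) / (fs * channels)

# De-interleave into one array per channel.
left = samples[0::channels]
right = samples[1::channels]

print(naive_duration, frame_duration, len(left), len(right))
```

Under these assumptions each per-channel array has one sample per frame, so its length divided by fs gives the slice length in seconds.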
If I try to save the data like this:
from scipy.io.wavfile import write
rate = fs
scaled = np.int16(guitar_np / np.max(np.abs(guitar_np)) * 32767)
write('test.wav', rate, scaled)
I get a very slow version of it, and the only way to make it sound like the original is to save it with rate = fs*2.
Any thoughts?