I have a couple of .wav sound files containing very similar percussive signals of ~60 ms duration. librosa's onset detection identifies their onset times quite well. I would now like to use these onset times to extract the associated ~60 ms audio segments from the files. Here is what I have done so far:
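import librosa
import librosa.display  # needed for librosa.display.specshow
import matplotlib.pyplot as plt
import numpy as np
import soundfile as sf  # needed for sf.write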
x, sr = librosa.load("C:/data/test.wav")
onset_frames = librosa.onset.onset_detect(y=x, sr=sr, wait=1, pre_avg=1, post_avg=1, pre_max=1,
                                          post_max=1)
print(onset_frames) # frame numbers of estimated onsets
onset_times = librosa.frames_to_time(onset_frames, sr=sr)
o_env = librosa.onset.onset_strength(y=x, sr=sr)
times = librosa.frames_to_time(np.arange(len(o_env)), sr=sr)
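# replaces the onset_frames above with manually peak-picked onsets
onset_frames = librosa.util.peak_pick(o_env, pre_max=10, post_max=10, pre_avg=10, post_avg=10, delta=2, wait=60)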
D = np.abs(librosa.stft(x))
plt.figure(figsize=(15,10))
ax1 = plt.subplot(2, 1, 1)
librosa.display.specshow(librosa.amplitude_to_db(D, ref=np.max),
                         x_axis='time', y_axis='log')
plt.title('Power spectrogram')
plt.subplot(2, 1, 2, sharex=ax1)
plt.plot(times, o_env, label='Onset strength')
plt.vlines(times[onset_frames], 0, o_env.max(), color='r', alpha=0.9, linestyle='--',
           label='Onsets')
plt.axis('tight')
plt.legend(frameon=True, framealpha=0.75)
plt.show()
print(onset_times)
If I use the following code to extract 100ms segments after the onset (with backtracking), I do not get the right segments:
onsets = librosa.onset.onset_detect(y=x, sr=sr, backtrack=True, units='time')
onsetsnb = librosa.onset.onset_detect(y=x, sr=sr, units='time')
for i, onset in enumerate(onsetsnb, start=1):
    # reload only the 100 ms that follow each (non-backtracked) onset
    current_sample, sr = librosa.load("C:/data/test.wav", offset=onset, duration=.100)
    sf.write(f'./clicks/clicks{i}.wav', current_sample, sr)
print("complete")
I would like to know how to use "onset_frames" or "onset_times", which I can tune through the peak-picking parameters of "librosa.util.peak_pick", to extract the 28 segments. Could anyone give me a hint?
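To make the goal concrete, here is a minimal sketch of the slicing step I am after. It is only an illustration under assumptions: `extract_segments` is a hypothetical helper name (not a librosa function), frame indices are converted to sample indices as frame * hop_length, and `hop_length=512` is assumed to match the hop used for the onset envelope (librosa's default). The demo at the bottom uses a synthetic list instead of real audio.

```python
def extract_segments(signal, onset_frames, sr, hop_length=512, duration=0.06):
    """Slice fixed-length segments out of `signal`, one per onset frame.

    Hypothetical helper, not part of librosa. `hop_length` is assumed to
    match the hop length used when the onset envelope was computed.
    """
    seg_len = int(round(duration * sr))  # segment length in samples
    segments = []
    for frame in onset_frames:
        start = int(frame) * hop_length           # frame index -> sample index
        stop = min(start + seg_len, len(signal))  # clip at the end of the signal
        if start < len(signal):                   # skip onsets past the end
            segments.append(signal[start:stop])
    return segments


# Synthetic check: 1 s of fake "audio" at 1000 Hz with onsets at frames 0 and 1.
demo = list(range(1000))
demo_segments = extract_segments(demo, [0, 1], sr=1000, hop_length=512, duration=0.06)
print([len(s) for s in demo_segments])
```

With the variables from my code above, I would expect something like `extract_segments(x, onset_frames, sr)` to return one array per peak-picked onset, each of which could then be saved with `sf.write`, but I am not sure this is the right approach.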