I am converting a Python program to Node.js. The program follows these steps:
- The microphone listens and delivers audio through callbacks
- Each callback performs a Librosa "log_mel_S" extraction
- An AI model runs inference on the "log_mel_S"
- The sound is labeled
I have managed to translate all of the steps, and everything related to them, from Python to Node.js, except for the Librosa extraction. This is an example of the required audio shape and type:
audio_sample = numpy.zeros(shape=(1024, 100), dtype=numpy.float32)
And this is the Librosa piece I need help translating:
import librosa
import numpy
S = numpy.abs(librosa.stft(y=audio_sample, n_fft=1024, hop_length=500)) ** 2
mel_S = numpy.dot(librosa.filters.mel(sr=44100, n_fft=1024, n_mels=64), S).T
log_mel_S = librosa.power_to_db(mel_S, ref=1.0, amin=1e-10, top_db=None)
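The last line seems to be the easiest one to port. As far as I can tell from the Librosa documentation, with top_db=None it computes 10 * log10(max(amin, x)) for every value and subtracts the same expression evaluated for ref. Here is a minimal Node.js sketch under that assumption. The name powerToDb is my own helper (not part of Librosa or Meyda), the input values are made up, and I have not verified the result against Librosa's output. The first two lines (the STFT and the mel filter bank) are the part I still cannot map.

```javascript
// Sketch of librosa.power_to_db(S, ref=1.0, amin=1e-10, top_db=None).
// Assumption: with top_db=None no clipping is applied to the result.
// powerToDb is a hypothetical helper name, not a Librosa or Meyda function.
function powerToDb(power, ref = 1.0, amin = 1e-10) {
  // Floor the reference at amin so log10 never receives zero.
  const refDb = 10 * Math.log10(Math.max(amin, ref));
  // power is an array of frames; each frame holds one power value per mel band.
  return power.map((frame) =>
    Float32Array.from(frame, (p) => 10 * Math.log10(Math.max(amin, p)) - refDb)
  );
}

// Made-up input: two frames with three mel-band powers each.
const melS = [
  Float32Array.of(1.0, 10.0, 0.0),
  Float32Array.of(100.0, 0.1, 1e-12),
];

const logMelS = powerToDb(melS);
console.log(logMelS);
```

Values at or below amin (here 0.0 and 1e-12) are floored to amin before the logarithm, so they all map to the same minimum instead of -Infinity.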
I found the Meyda package, and it looks like it could be a good substitute, but I am not sure how to approach this. It is unclear to me what exactly is being extracted by Librosa, so I cannot match it to terms like Amplitude Spectrum, Power Spectrum, etc.
Please help me understand and translate this action.