Background I'm trying to validate audio data received over RTP for accuracy against the original source. In my system, audio is played by embedded platform devices and sent out on the network for other devices to capture and play. Its specs vary depending on whether the audio is mono, stereo, or surround, but the sample rate and bit depth are as follows.
I'm using a .wav file for now, which contains a sine wave with the following spec: 44.1 kHz sample rate, 440 Hz frequency, 1 channel, 16-bit PCM data. I'm using a sine wave because it is clean and easy to analyze.
Using the following Python code, I could verify that both files are the same, since the alignment.distance it gives is 0.0:
import librosa
from dtw import dtw

# Load the audio files (librosa.load resamples to 22050 Hz mono by default)
y1, sr1 = librosa.load("./test-audio-files/1kHz_44100Hz_16bit_05sec.wav")
y2, sr2 = librosa.load("./test-audio-files/received.wav")
# Compute MFCC features; recent librosa versions require keyword arguments
mfcc1 = librosa.feature.mfcc(y=y1, sr=sr1)
mfcc2 = librosa.feature.mfcc(y=y2, sr=sr2)
# Align the two MFCC sequences (frames as rows) with dynamic time warping
alignment = dtw(mfcc1.T, mfcc2.T)
print("The DTW distance between the two:", alignment.distance)  # 0 for identical audio
Query What I'm wondering is how to validate the accuracy of the sine wave itself. The above solution works as long as I make sure my source file is perfect and accurate. If the source file has a problem, the above solution would still claim that the two files match. I'm working with a .wav file for now, but it could be any format, such as mp3 or mp4.
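One way to check a file without trusting a second file is to compare it with the mathematical ideal of the tone. The sketch below is only an illustration of that idea, not an established method for this setup: `check_sine` is a hypothetical helper, and it assumes the expected tone frequency is known in advance and that the samples are floats (as returned by librosa.load). It reports the dominant frequency from an FFT peak and a signal-to-noise ratio from a least-squares fit of a sine at the expected frequency.

```python
import numpy as np

def check_sine(samples, sample_rate, expected_hz):
    """Return (peak_hz, snr_db) for a signal that should be a pure tone.

    Hypothetical helper: assumes expected_hz is known and samples are floats.
    """
    x = np.asarray(samples, dtype=np.float64)
    n = len(x)

    # Dominant frequency: location of the largest bin in the windowed spectrum.
    spectrum = np.abs(np.fft.rfft(x * np.hanning(n)))
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    peak_hz = freqs[np.argmax(spectrum)]

    # Least-squares fit of a*sin + b*cos + c at the expected frequency.
    t = np.arange(n) / sample_rate
    basis = np.column_stack([
        np.sin(2 * np.pi * expected_hz * t),
        np.cos(2 * np.pi * expected_hz * t),
        np.ones(n),
    ])
    coeffs, *_ = np.linalg.lstsq(basis, x, rcond=None)
    fitted = basis @ coeffs

    # Everything the fit cannot explain is treated as noise or distortion.
    signal_power = np.mean((fitted - coeffs[2]) ** 2)
    noise_power = np.mean((x - fitted) ** 2)
    snr_db = 10 * np.log10(signal_power / max(noise_power, 1e-20))
    return peak_hz, snr_db
```

With the files from the question, this could be called as check_sine(y1, sr1, 440.0) on the source and again on the received audio. A low SNR on the source would suggest the reference itself is not a clean tone. Note that a small clock or frequency offset between sender and receiver would also lower the SNR, because the fit uses a fixed frequency.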
The following packet loss is what I am trying to detect:
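For a pure tone, a lost packet usually shows up as a sudden break in the waveform, either a phase jump where the audio was spliced or a stretch of silence. A minimal sketch of one possible detector follows; it is an assumption of mine rather than a standard tool. It relies on the fact that a sampled sine obeys x[n] = 2·cos(w)·x[n-1] - x[n-2], so any sample that breaks this recurrence is suspicious. The function name `find_discontinuities`, the `threshold` value, and the assumption that samples are normalized to [-1, 1] are all hypothetical.

```python
import numpy as np

def find_discontinuities(samples, sample_rate, tone_hz, threshold=0.05):
    """Return indices of samples that break the sine recurrence.

    Hypothetical helper: assumes a single known tone and samples in [-1, 1].
    """
    x = np.asarray(samples, dtype=np.float64)
    w = 2 * np.pi * tone_hz / sample_rate

    # Predict each sample from the two samples before it.
    predicted = 2 * np.cos(w) * x[1:-1] - x[:-2]
    error = np.abs(x[2:] - predicted)

    # error[k] belongs to sample k + 2, hence the offset.
    return np.flatnonzero(error > threshold) + 2
```

Dividing the returned indices by the sample rate gives the time of each suspected loss. The threshold has to sit above the 16-bit quantization noise but below the jump caused by a splice, so it would need tuning on real captures. A loss that happens to resume at exactly the same phase would not be detected this way.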
Reference Downloaded the .wav file from https://www.mediacollege.com/downloads/