I’m trying to develop an application that can identify an animal from a sound clip. I take an AMR recording, read its byte array, run that data through an FFT, and calculate the amplitudes from the result.
AMR file sample frequency: 8 kHz (standard AMR, 15 seconds)
Number of FFT points: 4096 for an input of 8192 values
I then calculate the amplitude as amplitude = 2 * (FFT point value) / 8192
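
To make the calculation concrete, here is a minimal Python sketch of the steps listed above. It is an illustration under assumptions, not my actual code: it assumes the input is already a list of decoded PCM sample values (not the raw AMR bytes), that "FFT point value" means the magnitude of the complex FFT output, and the names `fft` and `peak_frequency` are made up for this example. The test signal is a synthetic 1000 Hz tone, not an animal recording.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def peak_frequency(samples, sample_rate):
    """Return (frequency in Hz, amplitude) of the strongest non-DC bin."""
    n = len(samples)
    spectrum = fft(samples)
    # Only the first n/2 bins are unique for real input
    # (4096 bins for 8192 samples); scale magnitudes by 2/n.
    amplitudes = [2.0 * abs(spectrum[k]) / n for k in range(n // 2)]
    # Skip bin 0 (DC offset) when searching for the peak.
    k_peak = max(range(1, n // 2), key=lambda k: amplitudes[k])
    # Bin k corresponds to the frequency k * sample_rate / n.
    return k_peak * sample_rate / n, amplitudes[k_peak]


SAMPLE_RATE = 8000  # 8 kHz, as for the AMR recording
N = 8192            # number of input values

# Synthetic test signal: 1000 Hz sine with amplitude 0.5.
tone = [0.5 * math.sin(2 * math.pi * 1000.0 * i / SAMPLE_RATE)
        for i in range(N)]

freq, amp = peak_frequency(tone, SAMPLE_RATE)
print(freq, amp)
```

With these numbers each bin is 8000 / 8192 ≈ 0.98 Hz wide, so the 1000 Hz tone falls exactly on bin 1024 and the sketch reports a peak at 1000 Hz with an amplitude of about 0.5.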
My intention is to find the spike at the frequency with the highest amplitude. The issue is that this peak is not consistent across different sound clips of the same animal: for another clip, the frequency with the highest amplitude changes. Is there a reason for this? Any help or guidance would be appreciated. Thanks in advance.