
I’m trying to develop an application that can identify an animal from a sound clip. I take an AMR recording, read its byte array, pass that data through an FFT, and calculate the amplitudes from the result.

AMR file sample frequency: 8 kHz (standard AMR, 15 seconds)

Number of FFT points: 4096, for an input of 8192 values

I then calculate the amplitude as amplitude = 2 * (FFT point value) / 8192

My intention is to find the spike, i.e. the frequency with the highest amplitude. The issue is that this peak is not consistent across different sound clips of the same animal: for another clip, the frequency with the highest amplitude changes. Is there a reason for this? Any help and guidance will be appreciated. Thanks in advance.
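The setup described above (8 kHz sampling, amplitude = 2 * |FFT value| / N, then picking the strongest bin) can be sketched roughly as follows. This is only an illustration, not the asker's code: the `fft` and `peak_frequency` helpers and the synthetic 1000 Hz tone are made up for the example, and it assumes the input is a list of decoded PCM sample values rather than the raw compressed bytes of an AMR file. Bin `k` of an `N`-point FFT corresponds to the frequency `k * sample_rate / N`.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def peak_frequency(samples, sample_rate):
    """Return (frequency_hz, amplitude) of the strongest bin below Nyquist."""
    n = len(samples)
    spectrum = fft(samples)
    # One-sided amplitude: 2 * |X[k]| / N, skipping the DC bin (k = 0).
    amplitudes = [2.0 * abs(spectrum[k]) / n for k in range(1, n // 2)]
    k_peak = max(range(len(amplitudes)), key=amplitudes.__getitem__) + 1
    # Bin k corresponds to the frequency k * sample_rate / N.
    return k_peak * sample_rate / n, amplitudes[k_peak - 1]


# Synthetic stand-in for a decoded clip: a 1000 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
N = 1024
tone = [0.5 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(N)]
freq, amp = peak_frequency(tone, SAMPLE_RATE)
```

With a real recording the peak will usually not fall exactly on a bin, so the reported frequency is only accurate to about `sample_rate / N` Hz.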

Community
  • "But this isn’t happening" is far too vague a description of the exact problem to admit useful answers. – NPE Dec 26 '11 at 09:24
  • Thanks for the reply. The issue is that the spike at the highest amplitude is not consistent across different sound clips of the same animal. For another sound clip, the frequency with the highest amplitude changes. Is there a reason for this? – user1114638 Dec 26 '11 at 10:07
  • I updated the question; please respond. – user1114638 Dec 26 '11 at 12:56

1 Answer

Your file has a sample frequency of 8 kHz, but human hearing extends up to roughly 20 kHz, so are you sure that you are respecting the Nyquist frequency of your signal? (WAV files commonly use a sample rate of 44.1 kHz or 48 kHz.)

The Nyquist criterion states that to sample a given signal you must use a sample frequency that is at least twice the maximum frequency present in that signal. At 8 kHz, only content below 4 kHz can be represented; anything above that is either filtered out before sampling or aliased onto a lower frequency.
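To make the sampling limit concrete, here is a small sketch of what happens to a pure tone at a given sample rate. The function names are invented for this example, and it assumes an ideal sampler with no anti-aliasing filter in front of it (a real recorder would normally filter such content out instead).

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2.0


def aliased_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after ideal sampling."""
    folded = tone_hz % sample_rate_hz
    # Frequencies above the Nyquist limit fold back into [0, sample_rate / 2].
    return min(folded, sample_rate_hz - folded)


limit = nyquist_limit(8000)               # 4000.0 Hz for an 8 kHz recording
inside = aliased_frequency(3000, 8000)    # below the limit: unchanged
outside = aliased_frequency(5000, 8000)   # above the limit: folds to 3000 Hz
```

So at 8 kHz, a 5 kHz component of an animal call would be indistinguishable from a 3 kHz one.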

Also, the same animal can and will make different sounds, so the frequency you measure will never be exactly the same for two different samples. Do you have a tolerance threshold that accounts for these differences?
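A tolerance check of that kind could look something like the sketch below. The reference peak, the tolerance, and the clip values are all hypothetical numbers chosen for illustration; suitable values would have to come from your own recordings.

```python
def within_tolerance(peak_hz, reference_hz, tolerance_hz):
    """True if a measured peak lies close enough to the reference peak."""
    return abs(peak_hz - reference_hz) <= tolerance_hz


# Hypothetical values, for illustration only.
REFERENCE_PEAK_HZ = 1200.0
TOLERANCE_HZ = 150.0
clip_peaks_hz = [1130.0, 1275.0, 1900.0]

matches = [
    within_tolerance(p, REFERENCE_PEAK_HZ, TOLERANCE_HZ) for p in clip_peaks_hz
]
```

Here the first two clips fall inside the tolerance band and the third does not.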

Felipe
  • Thanks, Komyg. No, I don't. Some guidance on how to do that would be appreciated. The other issue is that the frequency mentioned above (8 kHz) is the standard sample rate for an AMR file, so how am I supposed to change that? I am open to reading up on any theory that I should know. – user1114638 Dec 26 '11 at 14:49
  • Well, first of all, by your description the AMR files are already sampled, so you shouldn't re-sample them. Also, you cannot change the sampling frequency of your files unless you have the original ones (the ones used to encode the AMR files). From what I've read around the internet, AMR is a standard for storing voice audio, so I don't think you will have any Nyquist frequency problems: this codec is apparently widely used, so I can only assume that someone thought of this problem while developing it. – Felipe Dec 26 '11 at 15:09
  • However, from what I've read, this codec compresses audio heavily, so it may be inadequate for you because you end up losing a lot of useful data in the compression. Perhaps you will get better results using a file without any compression (a .wav file, for example). – Felipe Dec 26 '11 at 15:12
  • The application that I am developing is a mobile application, so my initial recording format is AMR. This is the problem. Is it possible to convert from AMR to WAV in a J2ME environment? – user1114638 Dec 26 '11 at 15:23
  • I'm not sure if you can. However, the important thing here is that once your audio has been encoded as AMR, you have already lost part of your data to the compression. Therefore, even if you convert your files to, say, WAV or AIFF, the data lost in the first conversion stays lost, so the quality of your files/samples will not get any better. To use a WAV file you need either the raw recording (the one used to create the AMR file) or you need to re-record your sounds and then convert them to WAV. – Felipe Dec 26 '11 at 15:33
  • I will try the same process on WAV files and get back to you if I run into any issues. Thanks. – user1114638 Dec 26 '11 at 15:39
  • Your suggestion was completely correct, Komyg. Thank you. I got the spike that I needed. But now, getting a pattern that I can identify later seems like a “long shot”. Would anyone be able to propose an idea, please? – user1114638 Dec 26 '11 at 17:59
  • You are welcome! As for recognizing the pattern, I don't have many suggestions to give you; this is usually quite a hard thing to do (as are biometric algorithms in general)... You will probably need to find a correlation factor between a standard signal and your current sample and decide, based on a threshold, whether they match or not. You will probably need to do some research into signal processing and pattern recognition. – Felipe Dec 26 '11 at 18:19
  • What do you mean by the "standard signal"? Is it a standard sound clip of the animal? – user1114638 Dec 26 '11 at 18:26
  • By "standard signal" I mean that you should have a standard for comparison, something like an average sound that your animals make... However, I really don't know much about what you can do (my university days are long gone). Maybe you can ask another question about this specific issue (this one is marked as answered, so I think there won't be a lot of people looking at it). Or maybe you can seek some help from your (former) teachers at your university. – Felipe Dec 26 '11 at 19:41
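The "correlation factor plus threshold" idea from the comments above could be sketched as follows. This is one possible reading, not a recommendation from the thread: it uses the Pearson correlation coefficient between two amplitude spectra of equal length, and the helper names, the 0.8 threshold, and the tiny four-bin spectra are all made up for illustration.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrum carries no shape to compare
    return cov / math.sqrt(var_a * var_b)


def matches_reference(sample_spectrum, reference_spectrum, threshold=0.8):
    """Decide a match by comparing the correlation against a threshold."""
    return correlation(sample_spectrum, reference_spectrum) >= threshold


# Hypothetical amplitude spectra (one value per frequency bin).
reference = [0.1, 0.9, 0.3, 0.05]
louder_same_shape = [0.2, 1.8, 0.6, 0.1]   # same shape, twice the volume
different_shape = [0.9, 0.1, 0.05, 0.3]

same = matches_reference(louder_same_shape, reference)
other = matches_reference(different_shape, reference)
```

Because the coefficient is normalized, a louder recording with the same spectral shape still correlates strongly with the reference, which is one reason to compare whole spectra rather than a single peak.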