I am new to this, so please keep the explanation simple.
I have a project in which I have to classify a voice as good, bad or neutral. My plan is to extract the frequencies and pitch from every sample in the data set and train an SVM on them.
To get the pitch and frequencies of all the .wav files, I have written code that extracts the PCM data from an audio file. How should I now feed this data to a Fast Fourier Transform (FFT) algorithm to get the frequencies? Is there anything else I need to consider before passing the byte array to the FFT?
Here is my code for converting a WAV file to a PCM byte array:
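import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// The wrapper method name is illustrative; the body is the original snippet with its bugs fixed.
static byte[] readPcmBytes(String inputFile) throws IOException, UnsupportedAudioFileException {
    int totalFramesRead = 0;
    File fileIn = new File(inputFile);
    // try-with-resources closes the stream even if reading fails
    try (AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(fileIn)) {
        int bytesPerFrame = audioInputStream.getFormat().getFrameSize();
        if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
            // some audio formats may have unspecified frame size
            // in that case we may read any amount of bytes
            bytesPerFrame = 1;
        }
        // Set an arbitrary buffer size of 1024 frames.
        int numBytes = 1024 * bytesPerFrame;
        byte[] audioBytes = new byte[numBytes];
        // Collect every chunk; the buffer alone only ever holds the last chunk read.
        ByteArrayOutputStream pcm = new ByteArrayOutputStream();
        int numBytesRead;
        // Try to read numBytes bytes from the file.
        while ((numBytesRead = audioInputStream.read(audioBytes)) != -1) {
            pcm.write(audioBytes, 0, numBytesRead);
            // Calculate the number of frames actually read.
            totalFramesRead += numBytesRead / bytesPerFrame;
        }
        return pcm.toByteArray();
    }
}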
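For reference, here is a minimal sketch of what the next step could look like, assuming the file holds 16-bit signed little-endian mono PCM (the real encoding, sample size, channel count and byte order should be checked on the stream's `AudioFormat`). The class and method names (`PcmSpectrum`, `toSamples`, `applyHann`, `fft`, `dominantFrequency`) are hypothetical, and the FFT is a generic textbook radix-2 routine rather than any particular library's API. The raw bytes cannot go into the FFT directly: each pair of bytes is first combined into one sample, a frame whose length is a power of two is windowed, and only then transformed.

```java
public class PcmSpectrum {

    /** Converts 16-bit signed little-endian mono PCM bytes to samples in [-1, 1). */
    static double[] toSamples(byte[] pcm) {
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, read as unsigned
            int hi = pcm[2 * i + 1];      // high byte keeps its sign
            samples[i] = ((hi << 8) | lo) / 32768.0;
        }
        return samples;
    }

    /** Applies a Hann window in place to reduce spectral leakage at the frame edges. */
    static void applyHann(double[] frame) {
        int n = frame.length;
        for (int i = 0; i < n; i++) {
            frame[i] *= 0.5 * (1 - Math.cos(2 * Math.PI * i / (n - 1)));
        }
    }

    /** In-place iterative radix-2 FFT; the length must be a power of two. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (Integer.bitCount(n) != 1 || im.length != n) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the input into bit-reversed index order.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Combine butterflies of growing size.
        for (int len = 2; len <= n; len <<= 1) {
            double ang = -2 * Math.PI / len;
            for (int i = 0; i < n; i += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(ang * k), wi = Math.sin(ang * k);
                    int a = i + k, b = a + len / 2;
                    double tr = re[b] * wr - im[b] * wi;
                    double ti = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - tr; im[b] = im[a] - ti;
                    re[a] += tr;        im[a] += ti;
                }
            }
        }
    }

    /** Returns the frequency in Hz of the strongest non-DC bin of one frame. */
    static double dominantFrequency(double[] frame, float sampleRate) {
        double[] re = frame.clone();
        double[] im = new double[re.length];
        applyHann(re);
        fft(re, im);
        int best = 1;                 // skip bin 0, the DC offset
        double bestPower = -1;
        for (int k = 1; k < re.length / 2; k++) {
            double power = re[k] * re[k] + im[k] * im[k];
            if (power > bestPower) { bestPower = power; best = k; }
        }
        return best * sampleRate / re.length;  // bin index to Hz
    }
}
```

Bin `k` of an `N`-point FFT corresponds to `k * sampleRate / N` Hz, so only the first `N / 2` bins are meaningful for real input. Note that the strongest bin is not necessarily the pitch of a voice, since a harmonic can be louder than the fundamental; this sketch only shows how to get from bytes to a spectrum.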