I am new to this, so please keep the explanation simple.

I have a project in which I have to classify a voice as good, bad or neutral. My plan is to extract the frequencies and pitch from each sample in the data set and train an SVM on them.

To get the pitch and frequencies of all the .wav files, I have written code that reads the PCM data from an audio file. How should I feed this data to a Fast Fourier Transform (FFT) algorithm to get the frequencies? Is there anything else I should consider before passing the byte array to the FFT?

Here is my code for the conversion of a wav file to a PCM byte array:

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

class WavReader {
    // Reads the audio data of the given file and returns it as raw PCM bytes.
    static byte[] readPcmBytes(String inputFile)
            throws IOException, UnsupportedAudioFileException {
        File fileIn = new File(inputFile);
        try (AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(fileIn)) {
            int bytesPerFrame = audioInputStream.getFormat().getFrameSize();
            if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
                // some audio formats may have unspecified frame size
                // in that case we may read any amount of bytes
                bytesPerFrame = 1;
            }
            // Set an arbitrary buffer size of 1024 frames.
            byte[] audioBytes = new byte[1024 * bytesPerFrame];
            ByteArrayOutputStream pcm = new ByteArrayOutputStream();
            int numBytesRead;
            // Append every chunk so the result holds the whole file,
            // not only the last buffer that was read.
            while ((numBytesRead = audioInputStream.read(audioBytes)) != -1) {
                pcm.write(audioBytes, 0, numBytesRead);
            }
            return pcm.toByteArray();
        }
    }
}
Alexey Subach
  • I think you need to take into account the number of bytes that represent a single audio sample. Most audio files these days use 16 bits per sample. – john16384 Apr 02 '17 at 08:20
  • There are many similar questions on StackOverflow already, with good answers - try [searching for jtransforms+audio](http://stackoverflow.com/search?q=Jtransforms+audio). – Paul R Apr 02 '17 at 09:11
  • Your FFT library will probably require float input; look at its input requirements and convert the wave data accordingly. – Ahmed Fasih Apr 02 '17 at 12:11
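
To make the two conversion points from the comments concrete, here is a minimal sketch of turning the byte array into floating-point samples. It assumes 16-bit signed, little-endian, mono PCM (the usual WAV layout, but check `AudioFormat` for your files). The class and method names (`PcmDecoder`, `toSamples`, `applyHannWindow`) are made up for illustration, and the Hann window is a common pre-FFT step rather than something the comments prescribe.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; assumes 16-bit signed little-endian mono PCM.
final class PcmDecoder {

    // Converts raw PCM bytes into samples scaled to the range [-1.0, 1.0).
    static double[] toSamples(byte[] pcm) {
        int sampleCount = pcm.length / 2; // two bytes per sample; a trailing odd byte is ignored
        ByteBuffer buffer = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[sampleCount];
        for (int i = 0; i < sampleCount; i++) {
            samples[i] = buffer.getShort() / 32768.0;
        }
        return samples;
    }

    // Applies a Hann window in place to reduce spectral leakage before an FFT.
    static void applyHannWindow(double[] frame) {
        int n = frame.length;
        if (n < 2) {
            return; // nothing meaningful to window
        }
        for (int i = 0; i < n; i++) {
            frame[i] *= 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
        }
    }
}
```

The resulting `double[]` can then be cut into fixed-size frames (for example 1024 or 2048 samples), windowed, and handed to whichever FFT library you choose. For stereo files the channels are interleaved and would need to be separated or averaged first.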

1 Answer

There's a lot to consider beyond the FFT itself, because the strongest peak in an FFT magnitude spectrum is not necessarily the pitch frequency: the fundamental can be weaker than its harmonics, or even missing. Look up pitch detection/estimation algorithms instead of relying on a bare FFT magnitude.
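
As an illustration of what a pitch estimator can look like, here is a minimal time-domain autocorrelation sketch. It is only one of several known approaches and not necessarily the one meant above. The name `PitchEstimator`, its parameters, and the search range are assumptions chosen for the example; real voice data would also need framing, voicing detection and smoothing.

```java
// Hypothetical sketch of autocorrelation-based pitch estimation.
final class PitchEstimator {

    // Returns the estimated fundamental frequency in Hz within [minHz, maxHz],
    // or -1.0 if no positive autocorrelation peak is found (e.g. silence).
    static double estimatePitch(double[] samples, double sampleRate,
                                double minHz, double maxHz) {
        // A period of `lag` samples corresponds to sampleRate / lag Hz.
        int minLag = Math.max(1, (int) Math.floor(sampleRate / maxHz));
        int maxLag = Math.min(samples.length - 1, (int) Math.ceil(sampleRate / minHz));

        double bestValue = 0.0;
        int bestLag = -1;
        for (int lag = minLag; lag <= maxLag; lag++) {
            // Correlate the signal with a copy of itself shifted by `lag` samples.
            double sum = 0.0;
            for (int i = 0; i + lag < samples.length; i++) {
                sum += samples[i] * samples[i + lag];
            }
            if (sum > bestValue) {
                bestValue = sum;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? sampleRate / bestLag : -1.0;
    }
}
```

For a frame of decoded samples, a call such as `PitchEstimator.estimatePitch(frame, 44100.0, 80.0, 400.0)` searches a range that roughly covers speech. Unlike picking the largest FFT bin, this looks for the repetition period of the waveform, so it can still find the fundamental when a harmonic is louder.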

hotpaw2