
Within my beat detection, I'm using a Fast Fourier Transform to detect the bass in an audio signal. I'm recording a solo bass drum, without moving the sound source or changing the volume. After plotting the values over time, I get non-constant values; they differ very strongly. Do you have any idea why this happens? I can only guess, but maybe I'm not using the right buffer size or window size for the FFT?

Below are the plotted graph and the source code:

private class RecordingThread extends Thread {

    private boolean mShallContinue = true;

    @Override
    public void run() {
        // Compute the required audio buffer size and allocate the buffers.
        mBufferSize = 4096; // AudioRecord.getMinBufferSize(SAMPLING_RATE,
                            // AudioFormat.CHANNEL_IN_MONO, ...);

        mAudioBuffer = new short[1024]; // [mBufferSize / 2];
        bufferDouble2 = new int[mBufferSize / 2];
        bufferDouble = new int[(blockSize - 1) * 2];
        camera = Camera.open();
        AudioRecord record = new AudioRecord(
                MediaRecorder.AudioSource.DEFAULT, SAMPLING_RATE,
                AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, mBufferSize);

        short[] buffer = new short[blockSize];
        double[] audioDataDoubles = new double[(blockSize * 2)];
        double[] re = new double[blockSize];
        double[] im = new double[blockSize];
        double[] magnitude = new double[blockSize];

        // start collecting data
        record.startRecording();

        DoubleFFT_1D fft = new DoubleFFT_1D(blockSize);
        synchronized (this) {
            while (shallContinue()) {

                /** decibels */
                record.read(mAudioBuffer, 0, 1024);
                // updateDecibelLevel();

                /** frequency */
                // /windowing!?
                for (int i = 0; i < mAudioBuffer.length; i++) {
                    bufferDouble2[i] = (int) mAudioBuffer[i];
                }

                for (int i = 0; i < blockSize - 1; i++) {
                    double x = -Math.PI + 2 * i * (Math.PI / blockSize);
                    double winValue = (1 + Math.cos(x)) / 2.0;
                    bufferDouble[i] = (int) (bufferDouble2[i] * winValue);
                }
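                // Note: bufferDouble (the windowed copy) is never fed to the
                // FFT below; the transform runs on the un-windowed samples.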

                int bufferReadResult = record.read(buffer, 0, blockSize);

                // Read in the data from the mic to the array
                for (int i = 0; i < blockSize && i < bufferReadResult; i++) {
                    // normalize signed 16-bit samples to [-1, 1)
                    audioDataDoubles[2 * i] = (double) buffer[i] / 32768.0;
                    audioDataDoubles[(2 * i) + 1] = 0.0;
                }

                // audiodataDoubles now holds data to work with
                fft.complexForward(audioDataDoubles); // complexForward

                for (int i = 0; i < blockSize; i++) {

                    // real is stored in first part of array
                    re[i] = audioDataDoubles[i * 2];
                    // imaginary is stored in the sequential part
                    im[i] = audioDataDoubles[(i * 2) + 1];

                    // magnitude is calculated by the square root of
                    // (imaginary^2 + real^2)
                    magnitude[i] = Math.sqrt((re[i] * re[i])
                            + (im[i] * im[i]));
                }
                magnitude[0] = 0.0;

                magnitude2 = magnitude[2];
                magnitude3 = magnitude[3];
                magnitude4 = magnitude[4];

                updateShortBuffer();
                bufferCount++;
                updateLongBuffer();

                // if (detectedRoomRMS == 200)
                updateFrequency();
                System.out.println(System.currentTimeMillis() + " M2: "
                        + magnitude2 + " M3: " + magnitude3 + " M4: "
                        + magnitude4 + " M5: " + magnitude[5] + " M10: "
                        + magnitude[10] + " M20: " + magnitude[20] + " M24: "
                        + magnitude[24] + " M48: " + magnitude[48] + " LONG20: "
                        + rms_Long_Buffer_five + " LONNG: "
                        + rms_Long_Buffer);
            }
            record.stop(); // stop recording please.
            record.release(); // Destroy the recording, PLEASE!
        }
    }

    /**
     * true if the thread should continue running or false if it should stop
     */
    private synchronized boolean shallContinue() {
        return mShallContinue;
    }

    /**
     * Notifies the thread that it should stop running at the next
     * opportunity.
     */
    private synchronized void stopRunning() {
        mShallContinue = false;
    }

}

// post the output frequency to the TextView
private void updateFrequency() {
    tvfreq.post(new Runnable() {

        String RoomRMS;
        String s;

        public void run() {

            if (RMSMessureDone == false) {
                String l = "..";
                String KK = "...";
                tvfreq.setTextColor(Color.WHITE);
                if ((rmsCounter > 10))
                    tvfreq.setText(KK); //
                else
                    tvfreq.setText(l);
            } else {
                BPM = round(BPM, 1);
                s = Double.toString(BPM);
                s = s + " bpm";
                tvfreq.setTextColor(Color.WHITE);
                tvfreq.setText((s));

                RoomRMS = Double.toString(detectedRoomRMS);
                tvdb.setText(RoomRMS);
            }
        }

    });

}
  • Your title says "constant input signal", but in the body you say your input is a solo bass drum, which would be a strongly time-varying signal. So why would you expect the output to be constant? – Jim Lewis Jan 14 '15 at 23:54
  • What I mean is: I measure the values of the bass drum, which has the same volume the whole time, but my FFT sometimes returns smaller and sometimes bigger values. To be precise: the same bass drum produces very different values in the FFT. Why is that? – theholyfreq Jan 15 '15 at 13:11

1 Answer


I imagine the discrepancies you see have to do with the relationship between the onset and the window used for the FFT.

Fundamentally, the approach you are using is the wrong one for this problem:

1: The nature of the signal: The signal from a bass drum (and by this I assume you probably mean a kick drum?) features a sharp onset (it has just been hit hard) with a rapid decay. The initial peak is incoherent, with a wide bandwidth; it's essentially white noise. Whilst there will be plenty of low-frequency content in there, it won't dominate. After the initial attack, the drum skin vibrates at its natural frequency, with much lower output than the initial peak.

2: Looking through the square window: You're currently applying a square (rectangular) window function to your samples. This is not a winning choice, as it smears energy into bins where you don't want it. The Hamming and Blackman windows are common choices with FFTs; a sketch of applying one follows below.
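
For illustration, here is a minimal sketch of applying a Hamming window to the samples immediately before the forward transform. It reuses the names buffer, blockSize and audioDataDoubles from the question's code; the window formula itself is the standard one.

    // Fill the complex input with Hamming-windowed samples before calling
    // fft.complexForward(audioDataDoubles).
    for (int i = 0; i < blockSize; i++) {
        double winValue = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (blockSize - 1));
        audioDataDoubles[2 * i] = (buffer[i] / 32768.0) * winValue; // windowed real part
        audioDataDoubles[2 * i + 1] = 0.0;                          // imaginary part
    }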

3: Resolution: The fundamental flaw with using an FFT is that it is windowed. The result of a DFT is simply the contribution of each frequency bin over the period of the window. The window period limits your temporal resolution (you only know that an event within that range of frequencies occurred somewhere within the window). On the other hand, if you want meaningful results from the low-frequency bins of the FFT, Nyquist's theory applies with respect to the frequency of the window relative to the signal measured. Say you sample at 44.1 kHz: this means you need a 2048-point DFT if you want meaningful results at, say, 50 Hz. Each window then has a period of about 0.046 s (roughly 1/20 s). This is your margin of error on each temporal measurement; a worked example of the trade-off follows below.
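
To make the trade-off concrete, here is a small, self-contained sketch; the 44.1 kHz sample rate and 2048-point size are just the illustrative values from above.

    // Time/frequency trade-off for a windowed DFT.
    public final class FftResolution {
        public static void main(String[] args) {
            double sampleRate = 44100.0; // Hz
            int fftSize = 2048;          // window length in samples

            double binWidthHz = sampleRate / fftSize;    // ~21.5 Hz per bin
            double windowSeconds = fftSize / sampleRate; // ~0.046 s per window

            System.out.printf("bin width: %.1f Hz, window length: %.3f s%n",
                    binWidthHz, windowSeconds);
        }
    }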

There are a variety of time-domain onset-detection algorithms out there that are commonly used for beat-detection. You might use a frequency-domain approach in tandem if you wanted to detect the likely source of a signal.
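
If you go the time-domain route, a very rough, energy-based sketch might look like the following. The frame size, threshold ratio and the class/method names are illustrative assumptions, not a reference implementation.

    // A minimal, energy-based onset detector: flags a frame whose energy
    // jumps well above the recent average.
    public final class EnergyOnsetDetector {
        private final int frameSize;
        private final double thresholdRatio;
        private double runningAverage = 0.0;

        public EnergyOnsetDetector(int frameSize, double thresholdRatio) {
            this.frameSize = frameSize;
            this.thresholdRatio = thresholdRatio;
        }

        /** Returns true if the given PCM frame looks like an onset (energy spike). */
        public boolean onFrame(short[] pcm) {
            double energy = 0.0;
            for (int i = 0; i < frameSize && i < pcm.length; i++) {
                double s = pcm[i] / 32768.0; // normalize signed 16-bit samples
                energy += s * s;
            }
            energy /= frameSize;

            // An onset is flagged when the current frame's energy exceeds the
            // running average by some factor.
            boolean onset = runningAverage > 0.0
                    && energy > thresholdRatio * runningAverage;

            // Slowly track the local energy level (simple exponential smoothing).
            runningAverage = 0.9 * runningAverage + 0.1 * energy;
            return onset;
        }
    }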

marko