It is inherent to the math of an FFT that it produces frequency "buckets" of equal width. The number of buckets is half the sample size (the capture size), and they extend up to a frequency that is half the sample rate. (An FFT actually produces as many buckets as there are samples, but Android's Visualizer discards the second half before delivering the results, because for real-valued audio it is a reflection of the first half and so is not useful for visualization.)
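To make the bucket arithmetic concrete, here is a minimal Kotlin sketch. The function names are made up for illustration and are not part of the Visualizer API, and 44100 Hz is only an assumed sampling rate; the real one comes from the device.

```kotlin
// Hypothetical helpers, not part of the Visualizer API.

/** Number of useful buckets: half the capture (sample) size. */
fun bucketCount(captureSize: Int): Int = captureSize / 2

/** Spacing between buckets in Hz: the sampling rate divided by the capture size. */
fun bucketWidthHz(captureSize: Int, samplingRateHz: Int): Float =
    samplingRateHz.toFloat() / captureSize

/** Frequency in Hz that bucket k corresponds to. */
fun bucketFrequencyHz(k: Int, captureSize: Int, samplingRateHz: Int): Float =
    k * bucketWidthHz(captureSize, samplingRateHz)
```

With a capture size of 1024 at an assumed 44100 Hz, that gives 512 buckets spaced roughly 43 Hz apart, topping out at 22050 Hz.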
There is going to be a very limited range of permitted capture sizes and capture rates, based on hardware capabilities and plain old physics. Also, these two properties are inversely proportional: if your capture size is big, your capture rate has to be small. Audio is produced as a stream of evenly timed amplitudes (the spacing between them is the inverse of the samplingRate). Suppose for simplicity that the audio stream runs at only 1024 Hz, producing 1024 amplitudes per second. If your capture rate is 1 per second, you are collecting all 1024 of those amplitudes each time you capture, so your capture size is 1024. If your capture rate is 2 per second, you are collecting 512 amplitudes on each capture, so your capture size is 512.
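The same arithmetic as a tiny Kotlin sketch. The function name is hypothetical, and this models only the simplified relationship described above, not how Visualizer actually schedules its captures.

```kotlin
/** Amplitudes collected per capture, assuming every sample is captured exactly once. */
fun samplesPerCapture(samplesPerSecond: Int, capturesPerSecond: Int): Int =
    samplesPerSecond / capturesPerSecond
```

Here samplesPerCapture(1024, 1) gives 1024 and samplesPerCapture(1024, 2) gives 512, matching the example.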
Note: what I don't know for sure is what happens if you set a capture size that doesn't inversely match the capture rate you pass to setDataCaptureListener, that is, whether it ignores the size you set or actually repeats/drops data. I always use Visualizer.getMaxCaptureRate() as the capture rate.
What you can do (and it won't be exact) is average the buckets that fall within each of your target frequency ranges. I think you'll want to apply the log function to each magnitude before you average, or the results won't look great. Either way, you definitely need to apply a log function to the magnitudes at some point before drawing them for a visualizer to make sense to the viewer.
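Here is a rough sketch of the difference the order makes. The helper names are invented for illustration, and the inputs are assumed to be non-empty arrays of positive magnitudes.

```kotlin
import kotlin.math.log10

/** Log first, then average: every bucket contributes on the log scale. */
fun averageOfLogs(magnitudes: FloatArray): Float =
    magnitudes.map { log10(it) }.average().toFloat()

/** Average first, then log: the loudest bucket dominates the result. */
fun logOfAverage(magnitudes: FloatArray): Float =
    log10(magnitudes.average().toFloat())
```

For two buckets with magnitudes 1 and 100, averageOfLogs gives 1.0 while logOfAverage gives about 1.7, so a single loud bucket pulls the second version up much more.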
So after selecting a capture size you can prepare ranges to use for collecting the results.
    private val targetEndpoints = listOf(0f, 63f, 160f, 400f, 1000f, 2500f, 6250f, 16000f)
    private val DESIRED_CAPTURE_SIZE = 1024 // A typical value, has worked well for me
    private lateinit var frequencyOrdinalRanges: List<IntRange>

    //...

    val captureSizeRange = Visualizer.getCaptureSizeRange().let { it[0]..it[1] }
    val captureSize = DESIRED_CAPTURE_SIZE.coerceIn(captureSizeRange)
    visualizer.captureSize = captureSize
    // Visualizer reports its sampling rate in milliHertz, so convert to Hz first
    val samplingRateHz = visualizer.samplingRate / 1000f
    frequencyOrdinalRanges = targetEndpoints.zipWithNext { a, b ->
        // The + 1 omits the DC offset in the first range, and the overlap for remaining ranges
        val startOrdinal = 1 + (captureSize * a / samplingRateHz).toInt()
        val endOrdinal = (captureSize * b / samplingRateHz).toInt()
        startOrdinal..endOrdinal
    }
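As a sanity check on the range computation, here is a standalone sketch that can run off-device. The function name and the 44100 Hz sampling rate are assumptions for illustration; on a device the rate comes from the Visualizer.

```kotlin
/**
 * Maps band endpoints in Hz to inclusive ranges of FFT bucket ordinals.
 * Hypothetical helper; samplingRateHz must be in Hz, not milliHertz.
 */
fun ordinalRanges(
    endpointsHz: List<Float>,
    captureSize: Int,
    samplingRateHz: Float
): List<IntRange> =
    endpointsHz.zipWithNext { a, b ->
        // Start one past the previous band's last bucket (this also skips DC at ordinal 0)
        val startOrdinal = 1 + (captureSize * a / samplingRateHz).toInt()
        val endOrdinal = (captureSize * b / samplingRateHz).toInt()
        startOrdinal..endOrdinal
    }
```

With the endpoints listed above, a capture size of 1024 and 44100 Hz, this yields 1..1, 2..3, 4..9, 10..23, 24..58, 59..145 and 146..371. The lowest bands cover only one or two buckets, so expect them to be coarse.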
And then in your listener
    import kotlin.math.hypot
    import kotlin.math.log10

    //...

    override fun onFftDataCapture(
        visualizer: Visualizer,
        fft: ByteArray,
        samplingRate: Int
    ) {
        val output = FloatArray(frequencyOrdinalRanges.size)
        for ((i, ordinalRange) in frequencyOrdinalRanges.withIndex()) {
            var logMagnitudeSum = 0f
            for (k in ordinalRange) {
                // Bucket k is stored as a (real, imaginary) byte pair starting at index 2k
                val fftIndex = k * 2
                val magnitude = hypot(fft[fftIndex].toFloat(), fft[fftIndex + 1].toFloat())
                // Clamp to 1 so a silent bucket contributes log10(1) = 0 instead of -Infinity
                logMagnitudeSum += log10(magnitude.coerceAtLeast(1f))
            }
            output[i] = logMagnitudeSum / (ordinalRange.last - ordinalRange.first + 1)
        }
        // If you want magnitude to be on a 0..1 scale, you can divide it by log10(hypot(127f, 127f))
        // Do something with output
    }
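The 0..1 scaling mentioned in the comment could look something like this. The function and constant names are hypothetical. The divisor here uses 128 rather than 127, on the assumption that a byte component can be as low as -128; with 127, a full-scale bucket could come out marginally above 1.

```kotlin
import kotlin.math.hypot
import kotlin.math.log10

// Largest log-magnitude a (real, imaginary) byte pair can produce,
// assuming each component lies in -128..127
val MAX_LOG_MAGNITUDE: Float = log10(hypot(128f, 128f))

/** Scales an averaged log-magnitude to 0..1, clamping anything outside that range. */
fun normalizeLogMagnitude(logMagnitude: Float): Float =
    (logMagnitude / MAX_LOG_MAGNITUDE).coerceIn(0f, 1f)
```

You would apply it to each element of output before drawing, for example output[i] = normalizeLogMagnitude(output[i]).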
I did not test any of the above, so there might be errors. Just trying to communicate the strategy.