
I use the following code in a Thread to capture raw audio samples from the microphone and play them back through the speaker.

public void run() {
    short[] lin = new short[SIZE_OF_RECORD_ARRAY];
    int num = 0;
    // am = (AudioManager) this.getSystemService(Context.AUDIO_SERVICE); // -> MOVED THESE TO init()
    // am.setMode(AudioManager.MODE_IN_COMMUNICATION);
    record.startRecording();
    track.play();
    while (passThroughMode) {
        // Read up to SIZE_OF_RECORD_ARRAY samples; num is the count actually read.
        num = record.read(lin, 0, SIZE_OF_RECORD_ARRAY);
        for (int i = 0; i < num; i++) {
            // Amplify each sample, clamping to the 16-bit range to avoid wrap-around distortion.
            int amplified = (int) (lin[i] * WAV_SAMPLE_MULTIPLICATION_FACTOR);
            lin[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, amplified));
        }
        // Hand the (possibly amplified) samples to the output stream.
        track.write(lin, 0, num);
    }
    record.stop();
    track.stop();
    record.release();
    track.release();
}

where record is an AudioRecord and track is an AudioTrack. I need to know in detail (and in a simplified way, if possible) how AudioRecord stores PCM data and how AudioTrack plays it back. This is how I have understood it so far:

[figure: a sound wave sampled at discrete times, with the resulting 16-bit samples stored contiguously in an array]

As the while() loop runs continuously, record obtains SIZE_OF_RECORD_ARRAY samples (1024 for now), as shown in the figure. The samples are saved contiguously in the lin[] array of shorts (16-bit shorts, as I am using 16-bit PCM encoding). This is done by record.read(). Then track.write() passes these samples to the audio output, where the hardware plays them through the speaker. Is this correct, or am I missing something here?
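Since init() isn't shown above, here is a minimal sketch of how record and track could be constructed (the 44.1 kHz mono 16-bit format and the stream type are assumptions, not part of the original code). Note that getMinBufferSize() reports the smallest buffer, in bytes, that each class will accept:

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioRecord;
import android.media.AudioTrack;
import android.media.MediaRecorder;

// Hypothetical init(): sets up record and track for 16-bit mono PCM pass-through.
private static final int SAMPLE_RATE = 44100; // Hz; an assumed value

private void init() {
    // Minimum buffer sizes are reported in BYTES, not shorts.
    int minRecBytes = AudioRecord.getMinBufferSize(SAMPLE_RATE,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
    int minPlayBytes = AudioTrack.getMinBufferSize(SAMPLE_RATE,
            AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);

    record = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLE_RATE,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, minRecBytes);
    track = new AudioTrack(AudioManager.STREAM_MUSIC, SAMPLE_RATE,
            AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
            minPlayBytes, AudioTrack.MODE_STREAM);
}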

user13267

1 Answer


As for how the samples are laid out in memory: they're just arrays of linear approximations to a sound wave, taken at discrete times (as your figure shows). In the case of stereo, the samples will be interleaved (LRLRLRLR...).
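For instance, a stereo buffer read from an AudioRecord could be split into its two channels like this (a sketch; the array names are illustrative):

// interleaved holds 16-bit stereo PCM as frames: L0, R0, L1, R1, ...
short[] interleaved = new short[1024];
short[] left = new short[interleaved.length / 2];
short[] right = new short[interleaved.length / 2];

for (int frame = 0; frame < left.length; frame++) {
    left[frame] = interleaved[2 * frame];      // left sample of this frame
    right[frame] = interleaved[2 * frame + 1]; // right sample of this frame
}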

When it comes to the path the audio takes, you're essentially right, although there are a few more steps involved:

  • Your app writes the samples into the AudioTrack's buffer.
  • The AudioFlinger service reads from that buffer, mixes the track with any other active streams, and hands the mixed audio to the audio HAL.
  • The audio HAL passes the data down to the audio driver, typically going through some kind of DSP that applies various acoustic compensation filters, before a DAC converts it to the analog signal that drives the amplifier and speaker.

When recording from the internal microphone(s), you'd have more or less the same steps, except that they'd be done in the opposite order.

Note that some of these steps (essentially everything from the audio HAL and below) are platform-specific, and therefore might differ between platforms from different vendors (and even different platforms from the same vendor).

Michael
  • I am sorry but what did you mean by `typically going through some kind of DSP that applies various acoustic compensation filters,`? Does this mean the digital representation of the microphone input isn't exactly what is on the microphone? Is it already processed in some way through hardware? – user13267 Aug 21 '13 at 08:42
  • Again, changing `SIZE_OF_RECORD_ARRAY` doesn't seem to produce any variation in the output. I thought making this value very small would produce some kind of jitter or frame-skip effects in the output, but it doesn't seem to do anything at all – user13267 Aug 21 '13 at 08:43
  • _"Is it already processed in some way?"_. Correct. Typical filters that would be applied when recording audio are Automatic Gain Control or Dynamic Range Compression, and Noise Suppression. – Michael Aug 21 '13 at 08:45
  • You can query the AudioRecord for the minimum buffer size using the `getMinBufferSize` method. – Michael Aug 21 '13 at 08:46
  • Please let me try to explain it in a way I understand, and correct me if I am wrong. If I record a pure tone on my Android device and at the same time record it on a computer at the same sampling frequency and 16-bit PCM encoding, save both results as WAV files, and then compare the hex values representing the sound data in the two files, will they be different because the Android system has already applied some filters to the recording? – user13267 Aug 21 '13 at 08:53
  • I was thinking of implementing some noise reduction algorithms to suppress background noise when people are having a conversation. But if the device already performs such processing in hardware, won't it affect my implementation? I have implemented such algorithms in C on a PC with a prerecorded WAV file, where I assume no hardware-level filters were applied before I obtained the data. How will the pre-applied filters affect me if I want to do the same thing on an Android device? – user13267 Aug 21 '13 at 08:56
  • It's quite likely that you won't get a bit-exact match for recordings made on two different devices. But there could be any number of reasons for that (you have to consider the environment you're recording in, the form factors of the devices, the type of microphone used, the type of ADC, etc). As for how you'd go about implementing your app, I don't know. The exact combination of filters used and how they are tuned varies between different products, since their acoustic properties vary (because of the reasons listed above). – Michael Aug 21 '13 at 09:02
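Following up on the comments about the platform's recording filters: since API 16 the android.media.audiofx package exposes some of these effects, so an app can check for them and toggle them on its own recording session. A minimal sketch (assuming record is the AudioRecord from the question):

import android.media.audiofx.AutomaticGainControl;
import android.media.audiofx.NoiseSuppressor;

// Attach the platform's noise suppressor and AGC to the recording session,
// if the device exposes them (availability is vendor-specific).
int sessionId = record.getAudioSessionId();

if (NoiseSuppressor.isAvailable()) {
    NoiseSuppressor ns = NoiseSuppressor.create(sessionId);
    if (ns != null) ns.setEnabled(true); // or setEnabled(false) to try to bypass it
}
if (AutomaticGainControl.isAvailable()) {
    AutomaticGainControl agc = AutomaticGainControl.create(sessionId);
    if (agc != null) agc.setEnabled(true);
}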