I'm building a fairly simple Android app (SDK revision 14: ICS) which allows users to pick two audio clips at a time (all are RIFF/WAV format, little-endian, signed 16-bit PCM encoding) and combine them in various ways to create new sounds. The most basic method I'm using for this combination is as follows:
//...sound samples are read in to memory as raw byte arrays elsewhere
//...offset is currently set to 45 so as to skip the 44 byte header of basic
//RIFF/WAV files
...
//Actual combination method
public byte[] makeChimeraAll(int offset){
    for(int i = offset; i < bigData.length; i++){
        if(i < littleData.length){
            bigData[i] = (byte) (bigData[i] + littleData[i]);
        }
        else{
            //leave bigData alone
        }
    }
    return bigData;
}
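(For illustration, a minimal sketch of why summing individual bytes of 16-bit samples is lossy: each sample spans two bytes, so adding the bytes separately drops any carry out of the low byte into the high byte. The class name and the value 200 here are arbitrary choices for the demonstration, not part of the app.)

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteMixDemo {
    public static void main(String[] args) {
        // Two 16-bit samples, each with value 200, stored little-endian
        // (200 = 0x00C8, so the byte pair is [0xC8, 0x00])
        byte[] a = ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN)
                .putShort((short) 200).array();
        byte[] b = a.clone();

        // Byte-wise mix, as in makeChimeraAll above: each byte is summed
        // separately, so the carry out of the low byte is lost
        byte[] byteMix = new byte[2];
        for (int i = 0; i < 2; i++) {
            byteMix[i] = (byte) (a[i] + b[i]);
        }
        short wrong = ByteBuffer.wrap(byteMix).order(ByteOrder.LITTLE_ENDIAN).getShort();

        // Sample-wise mix: decode each byte pair into a short first, then add
        short right = (short) (ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN).getShort()
                + ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).getShort());

        System.out.println("byte-wise: " + wrong + ", sample-wise: " + right);
    }
}
```

200 + 200 should be 400, but the byte-wise mix yields 144 because the carry from the low byte never reaches the high byte.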
The returned byte array can then be played via the AudioTrack class as follows:
....
hMain.setBigData(hMain.getAudioTransmutation().getBigData()); //set the shared bigData
                                                              //to the bigData in the AudioTransmutation object
hMain.getAudioProc().playWavFromByteArray(hMain.getBigData(),
        22050 + (22050 * freqSeekSB.getProgress() / 100), 1024); //a SeekBar allows the user
                                                                 //to adjust the freq, ranging
                                                                 //from 22050 Hz to 44100 Hz
....
public void playWavFromByteArray(byte[] audio, int sampleRate, int bufferSize){
    int minBufferSize = AudioTrack.getMinBufferSize(sampleRate,
            AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT);
    AudioTrack at = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
            AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT,
            minBufferSize, AudioTrack.MODE_STREAM);
    at.play();
    at.write(audio, 0, audio.length);
    at.stop();
    at.release();
    for(int i = 0; i < audio.length; i++){
        Log.d("me", "the byte value at audio index " + i + " is " + audio[i]);
    }
}
The result of a combination and playback using the code above is close to what I want (both samples are still discernible in the resulting hybridized sound), but there are also a lot of cracks, pops, and other noise.
So, three questions: First, am I using AudioTrack correctly? Second, where is endianness accounted for in the AudioTrack configuration? The sounds play fine by themselves and sound almost like what I would expect when combined, so the little-endian nature of the RIFF/WAV format seems to be communicated somewhere, but I'm not sure where. Finally, what is the byte value range I should expect to see for signed 16-bit PCM encoding? I would expect to see values ranging from -32768 to 32767 in logcat from the Log.d(...) invocation above, but instead the results tend to be within the range of -100 to 100 (with some outliers beyond that). Could combined byte values beyond the 16-bit range account for the noise, perhaps?
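(For reference on the endianness question: AudioTrack's byte[] write path interprets 16-bit PCM data in the platform's native byte order, and Android devices are little-endian, which would explain why the little-endian WAV payload happens to play correctly without any explicit conversion. The decoding can also be made explicit; below is a minimal sketch, assuming a fixed 44-byte header, that uses java.nio.ByteBuffer to pull little-endian shorts out of the raw WAV bytes. The class and the toShorts helper are hypothetical names for illustration.)

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmDecode {
    // Decode the payload of a little-endian 16-bit PCM WAV byte array into
    // shorts, skipping a fixed-size header
    static short[] toShorts(byte[] wav, int headerSize) {
        ByteBuffer bb = ByteBuffer.wrap(wav, headerSize, wav.length - headerSize)
                .order(ByteOrder.LITTLE_ENDIAN);
        short[] samples = new short[bb.remaining() / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = bb.getShort(); // reads low byte first
        }
        return samples;
    }

    public static void main(String[] args) {
        // A stand-in 44-byte header followed by one sample:
        // bytes [0x2C, 0x01] little-endian = 0x012C = 300
        byte[] fake = new byte[46];
        fake[44] = 0x2C;
        fake[45] = 0x01;
        System.out.println(toShorts(fake, 44)[0]);
    }
}
```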
thanks, CCJ
UPDATE: major thanks to Bjorne Roche and William the Coderer! I now read the audio data into short[] structures, the endianness of the DataInputStream is accounted for using the EndianInputStream from William (http://stackoverflow.com/questions/8028094/java-datainputstream-replacement-for-endianness), and the combination method has been changed to this:
//Audio Chimera methods!
public short[] makeChimeraAll(int offset){
    //bigData and littleData are each short arrays, populated elsewhere;
    //SIGNED_SHORT_MAX (32767) and SIGNED_SHORT_MIN (-32768) are constants
    int intBucket = 0;
    for(int i = offset; i < bigData.length; i++){
        if(i < littleData.length){
            intBucket = bigData[i] + littleData[i];
            if(intBucket > SIGNED_SHORT_MAX){
                intBucket = SIGNED_SHORT_MAX;
            }
            else if(intBucket < SIGNED_SHORT_MIN){
                intBucket = SIGNED_SHORT_MIN;
            }
            bigData[i] = (short) intBucket;
        }
        else{
            //leave bigData alone
        }
    }
    return bigData;
}
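(The sum-then-clamp logic above can be checked in isolation; a minimal sketch, with the clamping factored into a single-sample helper and the class and method names invented for the demonstration:)

```java
public class MixClampDemo {
    static final int SIGNED_SHORT_MAX = Short.MAX_VALUE;  // 32767
    static final int SIGNED_SHORT_MIN = Short.MIN_VALUE;  // -32768

    // The same sum-then-clamp logic as makeChimeraAll, for one sample pair:
    // add in int space so the sum can't overflow, then saturate to short range
    static short mix(short a, short b) {
        int intBucket = a + b;
        if (intBucket > SIGNED_SHORT_MAX) {
            intBucket = SIGNED_SHORT_MAX;
        } else if (intBucket < SIGNED_SHORT_MIN) {
            intBucket = SIGNED_SHORT_MIN;
        }
        return (short) intBucket;
    }

    public static void main(String[] args) {
        System.out.println(mix((short) 1000, (short) 2000));     // in range: 3000
        System.out.println(mix((short) 30000, (short) 10000));   // clamps to 32767
        System.out.println(mix((short) -30000, (short) -10000)); // clamps to -32768
    }
}
```

Without the clamp, 30000 + 10000 would wrap around when cast to short, producing exactly the kind of harsh discontinuity heard as cracks and pops.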
The hybrid audio output quality with these improvements is awesome!