
I'm very new to OpenSL ES. I'm currently experimenting with the recording and playback features of OpenSL ES for Android. Right now I have a recording function which stores data in a buffer queue, and I can then play back the buffer queue. Would anyone be able to explain how I can correctly manipulate the data in the buffer queue so that the playback sounds different from the recording?

My current configuration:

sampleFormat.pcmFormat_ = static_cast<uint16_t>(engine.bitsPerSample_);
//the buffer
uint8_t *buf_;

Is there any type of conversion or decoding I need to do to the data in the buffer before manipulating it?

I would really appreciate some help.

  • I think this reference involves every answer related to `Audio_Processing` whether it is *Pre-Processing* or *Post-Processing*: [Android_Audio_Processing_Using_WebRTC](https://github.com/mail2chromium/Android-Audio-Processing-Using-WebRTC), You can also visit this reference: https://stackoverflow.com/a/58546599/10413749 – Muhammad Usman Bashir Apr 07 '20 at 08:48

1 Answer


Your question is broad. What I can do is explain how OpenSL ES is meant to be used and how you could manipulate the audio data obtained from recording.

1) Once you have set up your OpenSL ES engine, recorder and player properly (there are many examples out there), you have given OpenSL ES a buffer to fill with PCM data from the mic and a buffer holding the data you want to send to the playback sink, along with two callback functions that are called on completion. Once a buffer has been filled (after some time that depends on your settings, such as sample rate and buffer size), the record callback is called from a thread created by OpenSL ES. Depending on the device and configuration, this may be a high-priority thread, usually called a fast track. So in the callback you are not working on your own thread but on OpenSL ES's thread, and you have to be careful not to do blocking operations there. If you want to play the audio back as fast as possible, do your signal processing inside the callback. If response time is not too important, you can use the callback as a signal for your own thread to start reading and processing the audio data in the buffer as you wish. In both cases, to play the audio back you must enqueue the data (processed or unprocessed) on the player's buffer queue (the player also calls its own callback when it finishes a buffer).
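To make the "use the callback as a signal for your own thread" option concrete, here is a minimal sketch of a lock-free single-producer/single-consumer ring buffer. It is only an illustration: `SampleRing`, `push` and `pop` are hypothetical names, it assumes 16-bit samples, and it assumes exactly one writer (the recorder callback) and one reader (your worker thread). It uses only standard C++ and no OpenSL ES calls.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical queue of 16-bit samples: one writer thread, one reader thread.
class SampleRing {
public:
    // Capacity must be a power of two so indices can be masked cheaply.
    explicit SampleRing(std::size_t capacity)
        : data_(capacity), mask_(capacity - 1) {
        assert(capacity >= 2 && (capacity & (capacity - 1)) == 0);
    }

    // Producer side (e.g. the recorder callback): no locks, no allocation.
    // Returns how many samples were stored; the rest are dropped if full.
    std::size_t push(const int16_t* src, std::size_t count) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t freeSlots = data_.size() - (head - tail);
        const std::size_t n = count < freeSlots ? count : freeSlots;
        for (std::size_t i = 0; i < n; ++i) {
            data_[(head + i) & mask_] = src[i];
        }
        head_.store(head + n, std::memory_order_release);
        return n;
    }

    // Consumer side (your own processing thread).
    // Returns how many samples were copied into dst.
    std::size_t pop(int16_t* dst, std::size_t count) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t available = head - tail;
        const std::size_t n = count < available ? count : available;
        for (std::size_t i = 0; i < n; ++i) {
            dst[i] = data_[(tail + i) & mask_];
        }
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

private:
    std::vector<int16_t> data_;
    std::size_t mask_;
    std::atomic<std::size_t> head_{0};  // total samples written so far
    std::atomic<std::size_t> tail_{0};  // total samples read so far
};
```

In this arrangement the recorder callback would only call `push` and re-enqueue its buffer, while the worker thread calls `pop`, processes the samples, and hands them to the player's buffer queue. Whether dropping samples when the ring is full is acceptable depends on your application.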

2) If you want to process audio, you need to apply filters. There are many kinds of audio signal filters, and for real-time playback you should look for dynamic filters. (Some filters need a lot of data before they can start processing and may be bad for real time; others are optimized to work on small chunks of data and adapt their output dynamically.) So you would build a chain of filters in a certain order to obtain what you want. The audio world is huge, and you need to read quite a lot to start understanding audio processing. Audio performance is another matter and depends directly on the device you have (hardware and software).
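As a rough illustration of a filter that works on small chunks, here is a sketch of a two-stage chain: a one-pole low-pass filter that keeps its state between calls, followed by a gain stage. The names `OnePoleLowPass`, `Gain` and `processChain` are hypothetical, and the sketch assumes mono float samples; it is not the only way to structure a chain.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical one-pole low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1]).
struct OnePoleLowPass {
    float a;         // coefficient in (0, 1]; smaller values smooth more
    float y = 0.0f;  // last output, kept so blocks can be processed in turn

    explicit OnePoleLowPass(float coeff) : a(coeff) {
        assert(coeff > 0.0f && coeff <= 1.0f);
    }

    // Filters the block in place and remembers the last output.
    void process(float* samples, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            y += a * (samples[i] - y);
            samples[i] = y;
        }
    }
};

// Hypothetical gain stage: multiplies every sample by a constant factor.
struct Gain {
    float factor;

    void process(float* samples, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            samples[i] *= factor;
        }
    }
};

// Runs one block through the stages in a fixed order: low-pass, then gain.
inline void processChain(OnePoleLowPass& lowPass, Gain& gain,
                         float* samples, std::size_t count) {
    lowPass.process(samples, count);
    gain.process(samples, count);
}
```

Because the low-pass keeps `y` between calls, feeding it two blocks of two samples gives the same result as one block of four, which is the property you want when buffers arrive one callback at a time.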

3) How you manipulate the buffer you obtain depends on your sample format and your processor. Take endianness, for instance: processors may work in little or big endian, and the samples arrive in the byte order set in the PCM format you configured (on Android this is normally little endian), so check that it matches how you read them. There is no compression, so the PCM data is ready for processing. (If you would like to create a WAV file from it, you only need to write a WAVE header and put the PCM data in its data chunk. If you want another format like MP3, you also need to run your data through the compression algorithm for that format and put the result under the proper header.)
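For the `uint8_t *buf_` from the question, this could look like the sketch below. It assumes the recorder was configured for 16-bit PCM and that the bytes are in the host's byte order; `applyGain16` is a hypothetical helper name. Each pair of bytes is read as one signed 16-bit sample, scaled, clamped to the valid range, and written back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: scale 16-bit PCM held in a raw byte buffer, in place.
// Assumes host byte order and a size that is a whole number of samples.
void applyGain16(uint8_t* buf, std::size_t sizeBytes, float gain) {
    assert(sizeBytes % sizeof(int16_t) == 0);
    for (std::size_t off = 0; off < sizeBytes; off += sizeof(int16_t)) {
        int16_t sample;
        // memcpy avoids alignment and aliasing problems with a plain cast.
        std::memcpy(&sample, buf + off, sizeof(sample));

        float scaled = static_cast<float>(sample) * gain;
        // Clamp so loud samples saturate instead of wrapping around.
        scaled = std::min(32767.0f, std::max(-32768.0f, scaled));

        sample = static_cast<int16_t>(scaled);
        std::memcpy(buf + off, &sample, sizeof(sample));
    }
}
```

A call such as `applyGain16(buf_, sizeInBytes, 0.5f)` before enqueueing the buffer for playback would halve the volume, where `sizeInBytes` stands for whatever buffer size you configured. If your format is float or 8-bit instead, the sample type and range change accordingly.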

Also, to play data back through an OpenSL ES buffer queue you need uncompressed audio data, so you can't enqueue MP3 directly; you need to decompress it into PCM data first.

This is the basic functioning of OpenSL ES. I hope that answers your question; if something is unclear, let me know.

PS: Android says audio manipulation is easier now with the new AAudio library, which promises to accomplish the same tasks as OpenSL ES with a third of its complexity (some people have encountered latency issues, but I bet they are being fixed as you read).

alexm
  • Thanks for the great write-up @alexm. If I want to invert the audio that is recorded by the microphone, how would I do it? More importantly, what would I need to do to the buffer to achieve this? I'm assuming that I have to change the buffer values in the audio player callback. I've already tried negating the values in the buffer, but it hasn't worked: the audio just plays back the same as if I hadn't changed the values. – Ridoy Farhad Feb 15 '18 at 19:43
  • Thanks for the valuable input. Can you help me with this: whenever a video is playing on Android TV, can I manipulate the audio coming from it? For example, I need to apply speech-to-text to whatever audio is coming from it and add beeps over abusive words before it reaches the built-in speaker. Can I do this type of manipulation in real time? Please share your input on whether it's even possible or not. @RidoyFarhad – Piyush Aggarwal Apr 26 '22 at 03:15
  • @Ridoy Farhad sorry about the late answer; for some reason I never received a notification for your message. I think the answer is too late by now, but it might help other people. Audio processing 1) depends on the input configuration of OpenSL ES, which can be shorts or floats; 2) once you are manipulating the right type of data, it depends on the audio processing algorithm; 3) don't forget that your buffer holds left and right audio data, so you should process them separately when processing a stream and then put them back in place in the output buffer (which could be the same buffer). – alexm Apr 27 '22 at 17:54
  • @Piyush Aggarwal yes, it is possible. Based on my previous comment, you could send your input data to a processor that detects the sounds you want to beep out and replaces them. Keep in mind that you would need an AI algorithm (the hard part for real time), and these tend to be quite expensive in terms of processing. Depending on the algorithm you would need to train it first, and once you have your trained network, pass the data through it, then detect and replace the audio (this is the easy part). Not impossible, but a lot of work ahead. – alexm Apr 27 '22 at 17:59
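On the inversion question in the comments above, a minimal sketch of flipping the polarity of one channel might look like this. It assumes interleaved 16-bit samples (left, right, left, right, ...), and `invertChannel` is a hypothetical helper name. Note that a signal with inverted polarity generally sounds the same as the original when played on its own, which may be why negating the values made no audible difference; the effect shows up when the inverted signal is mixed with the original.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: negate every sample of one channel in an
// interleaved buffer, leaving the other channels untouched.
void invertChannel(int16_t* interleaved, std::size_t frames,
                   std::size_t channels, std::size_t channel) {
    assert(channel < channels);
    for (std::size_t f = 0; f < frames; ++f) {
        int16_t& s = interleaved[f * channels + channel];
        // -(-32768) does not fit in int16_t, so saturate that one value.
        s = (s == INT16_MIN) ? INT16_MAX : static_cast<int16_t>(-s);
    }
}
```

For a stereo buffer, `invertChannel(samples, frames, 2, 0)` would flip the left channel only. The change has to be made before the buffer is enqueued for playback, otherwise the player may already have consumed the unmodified data.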