
I would like to use AVFoundation's microphone input for speech detection, as shown in this iOS example, and simultaneously detect the pitch of the user's voice through the same microphone input using AudioKit. AudioKit is probably a wrapper around AVFoundation, but it has its own classes and initialization. Is there a way to hand AudioKit an existing microphone configuration like the one in the speech example, or some alternative way to use the Speech API and AudioKit's microphone pitch detection simultaneously? How might I achieve this?
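
For what it's worth, here is a minimal sketch of the "alternative way": skip AudioKit's microphone class entirely and drive everything from a single AVAudioEngine input tap, appending each buffer to an SFSpeechAudioBufferRecognitionRequest and handing the same buffer to whatever pitch analysis you use. The `analyzePitch` closure is a hypothetical stand-in for that analysis, and the snippet assumes microphone and speech-recognition permissions have already been granted.

```swift
import AVFoundation
import Speech

final class SharedMicPipeline {
    private let engine = AVAudioEngine()
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let request = SFSpeechAudioBufferRecognitionRequest()
    private var task: SFSpeechRecognitionTask?

    // Hypothetical hook: plug in your pitch detector (e.g. something built on AudioKit) here.
    var analyzePitch: ((AVAudioPCMBuffer) -> Void)?

    func start() throws {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.playAndRecord, mode: .measurement, options: .defaultToSpeaker)
        try session.setActive(true, options: .notifyOthersOnDeactivation)

        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)

        // One tap feeds both consumers from the same microphone buffers.
        input.installTap(onBus: 0, bufferSize: 1024, format: format) { [weak self] buffer, _ in
            self?.request.append(buffer)   // Speech framework
            self?.analyzePitch?(buffer)    // pitch detection
        }

        task = recognizer?.recognitionTask(with: request) { result, _ in
            if let result = result {
                print(result.bestTranscription.formattedString)
            }
        }

        engine.prepare()
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        request.endAudio()
        task?.cancel()
    }
}
```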

EDIT: The question is a little more complex

I need to be able to synchronize three things: touch events, AudioKit pitch-detection times, and speech detection times. Each of these operates on a different timebase. Speech gives me segment timestamps relative to the beginning of the audio recording, the timestamps for UITouch events are on a different clock, and I am not sure what AudioKit uses for its timestamps. There is some mention of host time and AV timestamps here, but I'm not sure that will get me anywhere. How speech and audio are supposed to be synchronized is still unclear to me. Could I get a lead on how this might work?
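
One possible lead, sketched under the assumption that everything can be keyed to the mach host clock: the input tap also hands you an AVAudioTime, whose hostTime converts to seconds on the same uptime clock that UITouch.timestamp and CACurrentMediaTime() use. If you record the host time of the first buffer appended to the speech request, a segment's timestamp (which Speech reports relative to the start of the audio) can be shifted onto that clock and compared directly with touch times; the same conversion should apply to any AudioKit callback that gives you a host-time-stamped AVAudioTime. The `TimebaseAligner` below is a hypothetical helper, not part of either framework.

```swift
import AVFoundation
import Speech
import UIKit

final class TimebaseAligner {
    /// Host time of the first buffer fed to the recognizer, converted to seconds on the
    /// uptime clock (the same clock that CACurrentMediaTime() and UITouch.timestamp use).
    private(set) var audioStartSeconds: TimeInterval?

    /// Call this from the input tap with the AVAudioTime the tap hands you.
    func noteFirstBufferIfNeeded(_ when: AVAudioTime) {
        guard audioStartSeconds == nil, when.isHostTimeValid else { return }
        audioStartSeconds = AVAudioTime.seconds(forHostTime: when.hostTime)
    }

    /// Speech segment timestamps are relative to the start of the audio stream,
    /// so shift them by the recorded start time to land on the shared clock.
    func absoluteTime(of segment: SFTranscriptionSegment) -> TimeInterval? {
        guard let start = audioStartSeconds else { return nil }
        return start + segment.timestamp
    }

    /// UITouch.timestamp is already in seconds since boot on that clock,
    /// so the two can be compared directly.
    func offset(from touch: UITouch, to segment: SFTranscriptionSegment) -> TimeInterval? {
        guard let spoken = absoluteTime(of: segment) else { return nil }
        return spoken - touch.timestamp
    }
}
```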

synchronizer
  • I think you can do that; forget the first part and set up the session using AudioKit instead. Look into how to pass the buffer data into the speech recognition system and find a way to convert your data into the input it expects. For pitch, check the "microphone analysis" example to get started. Also, AudioKit Pro released an app called Hey Metronome that seems to use the Speech API. (A sketch of this approach appears after these comments.) – punkbit May 21 '20 at 16:00
  • @punkbit So doing both setup processes independently would clobber the input/output, I gather? Hey Metronome seems to use third-party speech recognition software. Hmm. Would you know of any examples of how to use AKMicrophone with the pitch recognizer plus the Speech API? I looked briefly through the AudioKit source code, but there was no obvious way to do it. I'd also want the unprocessed input, whereas the pitch detection example applies some post-processing, including amplification. – synchronizer May 21 '20 at 16:30
  • I haven't tried having two separate processes, so I'm not sure whether it would cause problems; it just seems easier to manage one. I do use AVFoundation and AudioKit. You can take the microphone output and transform it into the input your preferred speech detection expects (I haven't read the docs, otherwise I would tell you exactly how, but just find out how to set it up, that's all). For the mic analysis, check the "microphone analysis" example. – punkbit May 21 '20 at 16:35
  • I did find this. Actually it looks close to perfect: https://github.com/warpling/SpeechRecognition – synchronizer May 21 '20 at 16:39
  • Ok, when done, don't forget to answer your own question. – punkbit May 21 '20 at 16:44
  • Yep, but I have to try it first. Since you seem to know about the Speech API/Audio, I did have another question about synchronizing the asynchronous results of Speech with other elements, but I will post another question about that. – synchronizer May 21 '20 at 17:22
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/214361/discussion-between-synchronizer-and-punkbit). – synchronizer May 21 '20 at 18:29
  • @punkbit Would you have thoughts on what I mentioned in the chat above? Thanks for your time. – synchronizer May 22 '20 at 18:12
  • That will be up to you to implement; I'm afraid I can't help there. A good read on shared clocks and synchronization is http://atastypixel.com/blog/experiments-with-precise-timing-in-ios/ – punkbit May 23 '20 at 11:19
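
To make the suggestion in the comments concrete, here is a rough, untested sketch of setting up the session with AudioKit and feeding the same microphone to the Speech framework via a tap on the engine's input node. It assumes AudioKit 4.x as used in the "microphone analysis" example (AKMicrophone, AKFrequencyTracker, AKBooster, AudioKit.output, AudioKit.engine); later AudioKit versions rename or restructure these, and the tap-plus-Speech part is my own addition rather than anything from that example.

```swift
import AudioKit
import AVFoundation
import Speech

final class AudioKitSpeechPitch {
    private let mic = AKMicrophone()
    private var tracker: AKFrequencyTracker!

    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let request = SFSpeechAudioBufferRecognitionRequest()
    private var task: SFSpeechRecognitionTask?

    func start() throws {
        // Pitch chain as in the Microphone Analysis example:
        // microphone -> frequency tracker -> muted booster as the output.
        tracker = AKFrequencyTracker(mic)
        AudioKit.output = AKBooster(tracker, gain: 0)

        // Tap the input node AudioKit is already using and forward the raw,
        // unprocessed buffers to the Speech framework.
        let input = AudioKit.engine.inputNode
        input.installTap(onBus: 0, bufferSize: 1024,
                         format: input.outputFormat(forBus: 0)) { [weak self] buffer, _ in
            self?.request.append(buffer)
        }

        task = recognizer?.recognitionTask(with: request) { result, _ in
            if let result = result {
                print(result.bestTranscription.formattedString)
            }
        }

        try AudioKit.start()
    }

    /// Latest detected pitch in Hz plus its amplitude (useful for gating out noise).
    var pitch: (frequency: Double, amplitude: Double) {
        return (tracker.frequency, tracker.amplitude)
    }

    func stop() throws {
        AudioKit.engine.inputNode.removeTap(onBus: 0)
        request.endAudio()
        task?.cancel()
        try AudioKit.stop()
    }
}
```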

0 Answers