Animation like Siri when you speak

Question

I would like to know if there is a way to do an animation like the one Siri shows when you speak. Do you know a way to do this in C#?

Yes. There are ways. If you have an R&D department and a few hundred million to drop, let me know - that sounds like an interesting project. IF, however, you're looking for something much, MUCH more modest, try Professor Google: https://www.google.com/search?q=c-sharp+voice+interaction&ie=&oe= If you get stuck on anything in particular, let us know and we'll help! — Shannon Holsinger, Sep 16 '16 at 12:20
I was talking about a line that moves when the mic captures a sound, like an equalizer when you listen to music. I wasn't talking about making a new Siri. — FlorianSL, Sep 16 '16 at 12:22
It must've been the whole "like Siri" thing that threw me. If you just want a spectrum analyzer, take a look here: https://www.google.com/search?q=c-sharp+spectrum+analyzer+sound&ie=&oe= The first link I posted covers the current (non-cutting-edge) knowledge of voice. Search. Play. Learn. Then ask! Good luck. — Shannon Holsinger, Sep 16 '16 at 12:26
It was the easiest example I could find, and I had forgotten the name of this thing. OK, I already know how speech recognition works. Thank you for the link. — FlorianSL, Sep 16 '16 at 12:30
You're right - my sincere apologies to OP. I was having a little fun with you when you were asking a perfectly valid question. I hope my comments haven't interfered with your ability to get a reasonable answer. — Shannon Holsinger, Sep 16 '16 at 14:33

score 1 · Accepted Answer · answered Sep 16 '16 at 12:26

Obviously there is a way to achieve this - but is it worth the effort?

What you would need:

- An audio input stream.
- A spectrum analyzer (something like what this does: http://www.qsl.net/dl4yhf/spectra1.html - there are more than enough signal-processing papers out there).
- A digestible format to display it.
- A new view (depending on the UI you chose) that can display this data.
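To make the "spectrum analyzer" and "digestible format" items more concrete, here is a minimal sketch, assuming you already have one buffer of mono samples scaled to [-1, 1] from whatever capture library you use. `SpectrumSketch` and `Magnitudes` are hypothetical names made up for illustration, and the naive DFT is only one simple way to get a few bar heights, not necessarily what any real product does.

```csharp
using System;

// Hypothetical helper (not part of any library): a naive DFT that turns
// one buffer of mono samples into a handful of magnitudes ("bars").
public static class SpectrumSketch
{
    // samples: mono audio scaled to [-1, 1].
    // bars: how many low-frequency bins to return, starting at bin 1.
    public static double[] Magnitudes(double[] samples, int bars)
    {
        int n = samples.Length;
        if (bars < 1 || bars >= n / 2)
            throw new ArgumentOutOfRangeException(nameof(bars));

        var result = new double[bars];
        for (int k = 1; k <= bars; k++) // bin 0 (DC offset) is skipped
        {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++)
            {
                double angle = 2 * Math.PI * k * t / n;
                re += samples[t] * Math.Cos(angle);
                im -= samples[t] * Math.Sin(angle);
            }
            // Scale so a full-scale sine exactly on bin k gives about 1.0.
            result[k - 1] = 2 * Math.Sqrt(re * re + im * im) / n;
        }
        return result;
    }
}
```

Bin k corresponds to a frequency of k * sampleRate / n, so with 1024 samples at 44100 Hz each bin is roughly 43 Hz wide. The cost grows with n * bars, which is fine for a few bars on a small buffer; for larger sizes an FFT implementation would be the usual choice.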

The problems here are manifold and out of scope to discuss in detail (and your question is so broad and informative that I am not willing to go into too much detail). Problems you will stumble upon are: audio input lag, processing lag, viewport lag, consuming the data, and probably a lot of issues with rendering it fast enough with a standard MVC framework.

The fluidity of Siri's UI for this is achieved through rendering the view on the GPU and having a proper audio/data filter that smooths out spikes. That makes smooth transitions possible and doesn't look nearly as aggressive as the rapid changes of an exact spectrogram.
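A filter that smooths out spikes can be as small as the sketch below. It is an assumption on my part that a one-pole (exponential) smoother with separate rise and fall rates is good enough here; `LevelSmoother` and its default rates are hypothetical and would need tuning by eye.

```csharp
using System;

// Hypothetical helper: exponential smoothing with a fast rise ("attack")
// and a slow fall ("release"), applied once per rendered frame.
public sealed class LevelSmoother
{
    private readonly double _attack;  // 0..1, share of the gap closed per frame when rising
    private readonly double _release; // 0..1, share of the gap closed per frame when falling

    public double Value { get; private set; }

    public LevelSmoother(double attack = 0.6, double release = 0.1)
    {
        _attack = attack;
        _release = release;
    }

    // raw: the newest unsmoothed level or bar height.
    // Returns the value to draw for this frame.
    public double Next(double raw)
    {
        double rate = raw > Value ? _attack : _release;
        Value += rate * (raw - Value);
        return Value;
    }
}
```

You would keep one smoother per bar (or one for the overall level) and call `Next` with the latest raw value on every frame. A higher attack makes the line jump up quickly when you start speaking, while a lower release lets it settle back gently instead of flickering.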

Not to mention the massive multi-million-dollar proprietary algorithms that process the data and hook it to back-end functionality. Because what good is all the work of getting a computer to listen to you if all it does is listen? — Shannon Holsinger, Sep 16 '16 at 12:28
Wut? He asked specifically about the animation when you speak - not processing the data. However: if you want to analyze natural language patterns in a very tiny scope, that's not really expensive. If you do, there are already alternatives - for example, Microsoft has a free service that transforms natural language into API calls (it has no current pricing and is in its early stages). I don't have the name off the top of my head - but if you're interested I can look it up. Edit: You may not have asked for it, but here: https://www.luis.ai/ — Mio Bambino, Sep 16 '16 at 12:36
