I'm working on a project for my computer science thesis. The goal is an application that lets the user sing or whistle a melody into a PC's or smartphone's microphone and then identifies which notes were sung.

I first need to study the theory behind such a program and then implement it in MATLAB, Java, or C.

I have already found a lot of information on Stack Overflow, but I am a little confused (regarding FFT, pitch, etc.). I would be grateful if anyone could tell me what I should study and what the implementation steps would be.

assylias
Francesco
  • 3
    This is a very broad question. Is there a specific programming issue that you need help with? What have you tried? – Ted Hopp Dec 03 '12 at 16:44
  • I first need to study the theory behind such a program and then implement it in MATLAB, Java, or C. – Francesco Dec 03 '12 at 16:47
  • First study the Fourier transform, then the fast Fourier transform and the windowed Fourier transform. Once you understand them, try using a library (e.g. [fftw](http://www.fftw.org/)) on some waveforms and play with the results. – Peter Schneider Dec 03 '12 at 16:47
  • 1
    This might help as a starting point for finding references (using Wikipedia directly in a thesis isn't a wise idea :)): http://en.wikipedia.org/wiki/Pitch_detection_algorithm – biziclop Dec 03 '12 at 16:48
  • 1
    You need to be aware that this is a hard subject unless you place very strict restrictions on which sounds you can handle. Consider starting with a tone generator producing a single sine wave. – Thorbjørn Ravn Andersen Dec 03 '12 at 16:50
  • 1
    Also, what has your advisor suggested you do now? – Thorbjørn Ravn Andersen Dec 03 '12 at 16:50
  • I discovered this site, alas, only while searching for information on the topic (my compliments to it). I have already read some articles about the FFT and pitch detection, but I am confused about two things: 1. Is my thesis project feasible? 2. Once I have studied the FFT and the rest, how should I set up my implementation, that is, how do I take the input and handle it within the program, and which library should I use? – Francesco Dec 03 '12 at 16:52

1 Answer

I don't think this is feasible as a thesis for a single person if you attempt to do it all from scratch. But it may be feasible if you integrate existing pieces together.

I'd look for some open source libraries first and try them as is. That might impose some limitations on what you can do, but that's fine, because the whole thing is quite big. It might make sense to put together a quick and dirty solution first, for example by taking a recorded sound file and using a library to recognize the sounds in it. Then add integration with other components, fancy output, audio recording, etc.

I mean something like this: https://dsp.stackexchange.com/a/2462

There may or may not be much open source material around, as the commercial interest in things like this seems to be high.
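
To give a feel for what a quick and dirty first step might look like, here is a minimal sketch in Python using only the standard library. It is not taken from any of the libraries mentioned above. It assumes a single clean tone (a synthesized sine standing in for a recorded file, as suggested in the comments), estimates the pitch with a plain autocorrelation rather than an FFT, and maps the result to the nearest equal-tempered note with A4 = 440 Hz. All function names, the 8 kHz sample rate, the 80 to 1000 Hz search range, and the 0.9 threshold are illustrative assumptions.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def make_sine(freq_hz, sample_rate=8000, n_samples=1024):
    """Synthesize a pure sine tone (hypothetical stand-in for a recording)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def estimate_pitch(samples, sample_rate=8000, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency in Hz via autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    window = len(samples) - max_lag
    if window <= 0:
        raise ValueError("not enough samples for the requested fmin")
    # Correlate the signal with a delayed copy of itself for each candidate lag.
    scores = [sum(samples[i] * samples[i + lag] for i in range(window))
              for lag in range(min_lag, max_lag + 1)]
    # Take the first strong local peak, not the global maximum, so that a
    # multiple of the true period is not chosen (a common octave error).
    threshold = 0.9 * max(scores)
    for k in range(1, len(scores) - 1):
        if scores[k] >= threshold and scores[k - 1] < scores[k] >= scores[k + 1]:
            return sample_rate / (min_lag + k)
    # Fallback: no interior peak found, use the best-scoring lag.
    return sample_rate / (min_lag + scores.index(max(scores)))


def frequency_to_note(freq_hz):
    """Map a frequency to the nearest equal-tempered note name (A4 = 440 Hz)."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


if __name__ == "__main__":
    tone = make_sine(440.0)
    pitch = estimate_pitch(tone)
    print(f"{pitch:.1f} Hz -> {frequency_to_note(pitch)}")
```

Because the lag is a whole number of samples, the frequency estimate is coarse, especially for high notes at this low sample rate. Real singing or whistling also contains harmonics, noise, and changing pitch, so a practical version would need to work on short overlapping frames and would probably lean on an existing library.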

Community
full.stack.ex
  • Indeed, I know it's a huge amount of work for one person. The implementation only needs to cover a small part of it: identifying which musical note was sung. – Francesco Dec 03 '12 at 19:08
  • @Francesco: a good piece of open source software can help anyway, because you can borrow the source and/or the approach. And even contact the authors! – full.stack.ex Dec 03 '12 at 19:15