
I am playing around with the UtterAsterisk example program that comes with TarsosDSP. The goal of this program is to show horizontal bars that indicate the note a user should make. A vertical bar moves from left to right to indicate to the user the correct timing of when to perform which notes. The user gets points depending on if the user made the correct note for the correct duration of time.

Link to screenshot of application: https://0110.be/files/photos/392/UtterAsterisk.png

There are 3 sections in this program:

  1. select audio input
  2. select detection algorithm
  3. visual representation of expected notes vs actual notes produced: A little black square is made every X milliseconds that represents the note made by the user. In the title of this section (in the latest version of the program), it says "whistling works best".

I am wondering why this code works best with whistling.

As background information, I am trying to make a quick prototype for a similar program, but where the user would produce non-whistling, non-vocal (no speech) sounds (like animal sounds) and would need to be matched for correctness.

I have tried whistling the notes indicated on the program and it does work pretty nicely (except for the fact that I'm terrible at whistling!).

I have tried selecting different detection algorithms, but when I make non-whistling sounds, the detected note doesn't always register in the 3rd section.

I have a feeling that whistling creates a single note, whereas making a quacking sound (like a duck) actually contains harmonics (hope I got this right: several tones mixed together to produce one sound).
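To check this intuition, here is a minimal, self-contained sketch I put together (this is not TarsosDSP code; the frequencies, amplitudes, and threshold are all made up for illustration). It uses a naive DFT to count spectral peaks in a pure "whistle-like" tone versus a harmonic-rich "quack-like" tone:

```java
// Hypothetical sketch: a pure tone should show one spectral peak,
// while a harmonic-rich tone should show several.
public class SpectrumSketch {

    // Synthesize a signal as a sum of sine waves.
    static double[] synth(double[] freqs, double[] amps, int n, int sampleRate) {
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            double t = (double) i / sampleRate;
            for (int j = 0; j < freqs.length; j++) {
                x[i] += amps[j] * Math.sin(2 * Math.PI * freqs[j] * t);
            }
        }
        return x;
    }

    // Normalized magnitude of the k-th DFT bin of x (naive O(n) per bin).
    static double binMagnitude(double[] x, int k) {
        double re = 0, im = 0;
        for (int n = 0; n < x.length; n++) {
            double a = 2 * Math.PI * k * n / x.length;
            re += x[n] * Math.cos(a);
            im -= x[n] * Math.sin(a);
        }
        return Math.hypot(re, im) / x.length;
    }

    // Crude "peak count": bins whose magnitude exceeds a threshold.
    static int countPeaks(double[] x, double threshold) {
        int peaks = 0;
        for (int k = 1; k < x.length / 2; k++) {
            if (binMagnitude(x, k) > threshold) peaks++;
        }
        return peaks;
    }

    public static void main(String[] args) {
        int sampleRate = 8000, n = 800; // 0.1 s of audio, 10 Hz bin spacing
        double[] whistle = synth(new double[]{440}, new double[]{1.0}, n, sampleRate);
        double[] quack = synth(new double[]{440, 880, 1320},
                               new double[]{1.0, 0.8, 0.6}, n, sampleRate);
        System.out.println("whistle peaks: " + countPeaks(whistle, 0.1)); // 1
        System.out.println("quack peaks:   " + countPeaks(quack, 0.1));   // 3
    }
}
```

With exact-bin frequencies like these, the single sine shows one peak and the three-harmonic signal shows three, which is the kind of difference I suspect a single-pitch detector struggles with.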

Line 151, 152: https://github.com/JorenSix/TarsosDSP/blob/master/src/examples/be/tarsos/dsp/example/UtterAsterisk.java

// add a processor, handle percussion event.
dispatcher.addAudioProcessor(new PitchProcessor(algo, sampleRate, bufferSize, this));

The PitchProcessor, I believe, will only handle a single peak, as it returns a PitchDetectionResult, which contains only a single frequency (line 59): https://github.com/JorenSix/TarsosDSP/blob/master/src/core/be/tarsos/dsp/pitch/PitchDetectionResult.java

Unfortunately, I am mostly a beginner in the field of digital signal processing and could use some help understanding why whistling is better in this particular application. If my intuition is right (whistling = single note), how could one do the same basic thing this program does (compare a user-made animal sound with a recording for a match)?
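For the matching part, the rough idea I have in mind is to compare whole magnitude spectra instead of a single detected pitch. Here is a self-contained sketch of that idea (all names here are illustrative, not TarsosDSP API; the cosine-similarity metric is just one naive choice):

```java
// Hypothetical sketch: score how well one sound matches a reference
// by cosine similarity between their magnitude spectra.
public class SpectrumMatch {

    // Magnitude spectrum via a naive DFT (first half of the bins).
    static double[] magnitudeSpectrum(double[] x) {
        double[] mag = new double[x.length / 2];
        for (int k = 0; k < mag.length; k++) {
            double re = 0, im = 0;
            for (int n = 0; n < x.length; n++) {
                double a = 2 * Math.PI * k * n / x.length;
                re += x[n] * Math.cos(a);
                im -= x[n] * Math.sin(a);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    // Cosine similarity: 1.0 = identical spectral shape, 0.0 = no overlap.
    static double similarity(double[] a, double[] b) {
        double[] ma = magnitudeSpectrum(a), mb = magnitudeSpectrum(b);
        double dot = 0, na = 0, nb = 0;
        for (int k = 0; k < ma.length; k++) {
            dot += ma[k] * mb[k];
            na += ma[k] * ma[k];
            nb += mb[k] * mb[k];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }
}
```

The idea would be to score each user buffer against the corresponding buffer of the reference recording, which should tolerate multi-tone sounds better than a single-frequency pitch detector.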

Thank you for your input!

    I'm voting to close this question as off-topic because it is about pitch detection, audio harmonics and whistling and does not appear to be about programming. – Elliott Frisch Jan 07 '19 at 20:48
  • Fair point. Is there a way to move my question to the DSP / TarsosDSP section? I thought this was only controlled by the tags I added, but it appears I was wrong. – David Poisson Jan 07 '19 at 21:01

1 Answer


It seems likely that the answer is right here.

where the user would produce non-whistling, non-vocal (no speech) sounds (like animal sounds) and would need to be matched for correctness.

It seems likely that those "sounds" are the result of multiple simultaneous tones, whereas (human) whistling is likely to produce a single tone.

For a comparison, listen to the difference between a single note (one key) played on a piano and a chord (multiple notes) played on a piano.

Another option is comparing a telephone dial tone (e.g., press 7) with whistling. The telephone produces DTMF (Dual-Tone Multi-Frequency) sounds, i.e., two tones at once.
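For example, the standard DTMF keypad assignments look like this (a small illustrative table in code, not TarsosDSP):

```java
import java.util.Map;

// Each DTMF key press emits two simultaneous tones: one from a
// low-frequency row group and one from a high-frequency column group.
public class Dtmf {
    static final Map<Character, double[]> KEY_TONES = Map.of(
        '1', new double[]{697, 1209}, '2', new double[]{697, 1336},
        '3', new double[]{697, 1477}, '4', new double[]{770, 1209},
        '5', new double[]{770, 1336}, '6', new double[]{770, 1477},
        '7', new double[]{852, 1209}, '8', new double[]{852, 1336},
        '9', new double[]{852, 1477}, '0', new double[]{941, 1336}
    );

    public static void main(String[] args) {
        double[] f = KEY_TONES.get('7');
        System.out.println("Pressing 7 emits " + f[0] + " Hz + " + f[1] + " Hz");
    }
}
```

So pressing 7 emits 852 Hz and 1209 Hz at the same time, which is exactly the kind of multi-tone signal a single-pitch detector cannot summarize as one frequency.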

DwB
  • You are correct, DwB. It appears that just the act of putting together a long-winded question stitching together everything I knew about the subject brings up the answer clearly. Thanks for pointing me to it! I will accept your answer as the right one. – David Poisson Jan 08 '19 at 18:37