According to what I have read on the internet, the normal range of the fundamental frequency of a female voice is 165 to 255 Hz. I am using Praat and a Python library called Parselmouth to get the fundamental frequency values of a female voice in an audio file (.wav). However, I got some values above 255 Hz (e.g. 400+ Hz, 500 Hz). Is it normal to get values this large?

sttc1998

1 Answer

It is possible, but unlikely, if you are trying to capture the fundamental frequency (F0) of a speaking voice. More likely you are capturing a more easily resonating overtone or formant (e.g. F1 or F2) instead.

My experiments with Praat give me the impression that, with good parameters, it will reliably extract F0.

To verify that, compare the pitch curve with a spectrogram. Here's an example of a fit made by Praat (female speaker):

[Figure: Spectrogram and F0]

You can see from the image that

  • The most prominent frequency seems to be F2
  • Around 200 Hz seems likely to be F0, since there's only noise below that (compared to before/after the segment)
  • Praat has calculated a good estimate of F0 for the voiced speech segments
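As a numeric complement to the visual check, here is a minimal sketch that summarizes how many voiced frames fall outside an expected range. It assumes Parselmouth is installed; the file name `speech.wav` and the helper names `summarize_f0` and `extract_f0` are hypothetical, and the 165 to 255 Hz bounds are simply the range quoted in the question.

```python
import statistics


def summarize_f0(f0_values, low_hz=165.0, high_hz=255.0):
    """Summarize voiced F0 frames and how many fall outside [low_hz, high_hz].

    Frames with a value of 0 are treated as unvoiced and skipped,
    which is how Praat's pitch tracks mark unvoiced frames.
    """
    voiced = [float(f) for f in f0_values if f > 0]
    if not voiced:
        return {"voiced_frames": 0, "median_hz": None, "fraction_outside": None}
    outside = sum(1 for f in voiced if f < low_hz or f > high_hz)
    return {
        "voiced_frames": len(voiced),
        "median_hz": statistics.median(voiced),
        "fraction_outside": outside / len(voiced),
    }


def extract_f0(wav_path):
    """Return (frame times, F0 values in Hz) using Praat's default pitch settings."""
    import parselmouth  # third-party; imported lazily so the helper above works without it

    sound = parselmouth.Sound(wav_path)
    pitch = sound.to_pitch()
    return pitch.xs(), pitch.selected_array["frequency"]


# Hypothetical usage:
# times, f0 = extract_f0("speech.wav")
# print(summarize_f0(f0))
```

A few stray frames at 400 to 500 Hz would point to occasional tracking errors, while a large fraction outside the range suggests the analysis parameters need attention.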

If, after a visual inspection, it seems that you are getting wrong results, you can try to tweak the parameters. Window length greatly affects the frequency resolution.

If you can't capture frequencies this low, you should try increasing the window length: the intuition is that it gives the algorithm a better chance at finding slowly changing periodic features in the data.
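A rough sketch of how these parameters could be set from Parselmouth follows. In Praat's autocorrelation method the window length is not set directly. According to the Praat manual, it follows from the pitch floor (about three periods of the lowest candidate frequency in the default mode), so lowering the floor lengthens the window. The floor and ceiling values below are illustrative assumptions, not recommendations from this thread, and the function names are hypothetical.

```python
def analysis_window_seconds(pitch_floor_hz, periods=3.0):
    """Effective analysis window implied by the pitch floor.

    Assumes Praat's default autocorrelation mode, where the window
    spans about three periods of the pitch floor.
    """
    if pitch_floor_hz <= 0:
        raise ValueError("pitch_floor_hz must be positive")
    return periods / pitch_floor_hz


def extract_f0_constrained(wav_path, pitch_floor_hz=100.0, pitch_ceiling_hz=500.0):
    """Extract F0 with an explicit search range (values are illustrative)."""
    import parselmouth  # third-party; imported lazily

    sound = parselmouth.Sound(wav_path)
    pitch = sound.to_pitch_ac(
        pitch_floor=pitch_floor_hz,      # lower floor -> longer analysis window
        pitch_ceiling=pitch_ceiling_hz,  # candidates above this are not considered
    )
    return pitch.selected_array["frequency"]


# Hypothetical usage:
# f0 = extract_f0_constrained("speech.wav", pitch_floor_hz=75.0, pitch_ceiling_hz=400.0)
# print(analysis_window_seconds(75.0))
```

Narrowing the ceiling also keeps the tracker from choosing candidates far above the expected speaking range, at the cost of missing genuinely high pitch if it occurs.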

Sami Hult
  • Thank you for your reply. I figured out the problem. It seems there was something wrong with my silence threshold parameter. I was using -0.45dB, but when I changed it to 0.1dB, all the values seem to fall within the range. But I doubt that this is the best solution. What is the normal silence threshold used in audio analysis? – sttc1998 Dec 16 '18 at 11:30
  • @sttc1998 I doubt there is such a thing as a "normal silence threshold". A suitable value depends on your purpose and on the domain in which you are noise gating (time, frequency). If the purpose is to find the fundamental frequency of voiced segments, I wouldn't noise gate at all: autocorrelation on cepstrum will be somewhat immune to reasonable noise. – Sami Hult Dec 16 '18 at 12:57