6

I'm trying to do speech-to-text recognition on a .wav file I have, using Google, Google Cloud, and Houndify.

I've noticed that the latter two have no problem with profanities, but the Google speech recognizer filters them, for example, f***, s***.

This creates a problem for me because I want to do sentiment analysis with LIWC, and the program assigns no profanity weight to filtered words like f***.

I've tried both of the following.

(1) Turning profanity filter off

recognizer_instance.recognize_google(audio_data: AudioData, key: Union[str, None] = None, language: str = "en-US", pfilter: Union[0, 1], show_all: bool = False) -> Union[str, Dict[str, Any]]

https://github.com/Uberi/speech_recognition/blob/master/reference/library-reference.rst

(2) Remove profanity censor from Google Speech Recognition

But neither of them solved the problem.

r.recognize_google(example_audio)

---> what the f*** is wrong with you

But then,

r.recognize_google(example_audio, pfilter=0)

Gives

TypeError                                 Traceback (most recent call last)
<ipython-input-21-b158a03c879c> in <module>
----> 1 r.recognize_google(example_audio, pfilter=0)

TypeError: recognize_google() got an unexpected keyword argument 'pfilter'

How should I solve this problem?

I know that many solutions on Stack Overflow refer to the recognizer for the Google Cloud API. I already have Google Cloud (r.recognize_google_cloud) working, so I want a solution for recognize_google, not Google Cloud, because I want to compare the results.

vezunchik
KKKM

2 Answers

0

I am hitting the same thing. Looking at the code on GitHub (https://github.com/Uberi/speech_recognition/blob/master/speech_recognition/__init__.py), I can see that the pfilter parameter is supported, as the documentation suggests. However, the version I got from pip install, which also claims to be 3.8.1, does not have the pfilter parameter at all.

Looking at the implementation, though, the parameter only controls whether "pfilter": 0 | 1 is added to the dictionary for the request, so one route forward is to edit your local copy and add this entry to the dictionary.

Very frustrating to have this sort of inconsistency :(
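Before editing anything, it may help to confirm whether the installed function really lacks the parameter. The following is a minimal sketch using only the standard library; accepts_keyword and the two stand-in functions are hypothetical names made up for illustration, not part of speech_recognition.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs catch-all accepts any keyword
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins mimicking the two signatures discussed above
def recognize_without_pfilter(audio_data, key=None, language="en-US", show_all=False):
    pass


def recognize_with_pfilter(audio_data, key=None, language="en-US", pfilter=0, show_all=False):
    pass


print(accepts_keyword(recognize_without_pfilter, "pfilter"))  # False
print(accepts_keyword(recognize_with_pfilter, "pfilter"))     # True
```

With the real library installed, the same check would presumably be accepts_keyword(sr.Recognizer.recognize_google, "pfilter") after import speech_recognition as sr. A result of False would match the TypeError shown in the question.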

0

Open __init__.py in the speech_recognition package and find

def recognize_google(self, audio_data, key=None, language="en-US", show_all=False):

and edit to

def recognize_google(self, audio_data, key=None, language="en-US", show_all=False, pfilter=1):

Next, find

url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
            "client": "chromium",
            "lang": language,
            "key": key,
        }))

and edit to

url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
            "client": "chromium",
            "lang": language,
            "key": key,
            "pFilter": pfilter,
        }))

After that,

r.recognize_google(example_audio, pfilter=0)

will work.
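To see what the edit changes in the outgoing request, here is a small standalone sketch that builds the same query string outside the library. build_recognize_url and the placeholder key are assumptions for illustration only; the dictionary keys and base URL are copied from the snippet above, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_recognize_url(key, language="en-US", pfilter=1):
    """Build the request URL with the same query dictionary as the edited snippet."""
    query = urlencode({
        "client": "chromium",
        "lang": language,
        "key": key,
        "pFilter": pfilter,  # 0 is meant to switch the profanity filter off, 1 leaves it on
    })
    return "http://www.google.com/speech-api/v2/recognize?{}".format(query)


# "PLACEHOLDER_KEY" is a dummy value, not a real API key
url = build_recognize_url("PLACEHOLDER_KEY", pfilter=0)
print(url)
print(parse_qs(urlparse(url).query)["pFilter"])  # ['0']
```

Keeping pfilter=1 as the default, as in the edited signature, means existing calls that do not pass pfilter behave as before.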