Using speech-to-text with googleLanguageR produces NULL transcripts

Question

I'm using the R package 'googleLanguageR' to transcribe a set of 30-second audio files (over 500, so I want to automate this). I've followed all the steps in the googleLanguageR tutorials, got my key, and authenticated through R.

I'm able to transcribe the test audio (.wav) that comes with the package, but whenever I apply the same function to my files (.mp3), I get NULL for both transcript and timings.

This is the code provided in the tutorials:

# get the sample source file
test_audio <- system.file("woman1_wb.wav", package = "googleLanguageR")
gl_speech(test_audio)$transcript

If I use the same for my file, I get an empty element, so I've tried the following with no luck:

test_audio <- "/audio_location/filename.mp3"
gl_speech(test_audio)$transcript

Has anybody encountered a similar problem with this package, or does anyone have an idea why it produces NULL transcripts?

You need to specify the format of the audio file if it is not .wav. See the website reference: http://code.markedmondson.me/googleLanguageR/reference/gl_speech.html. MP3 files are not well suited to transcription because a lot of the audio information is lost, so you may need to try to find the audio in another format. — MarkeD, Feb 05 '20 at 06:51

Thanks! I figured out that you have to convert everything to FLAC and be specific about the sample rate in hertz; otherwise it won't transcribe, or it will transcribe with more errors. Specifically, I used this batch script with ffmpeg to do so (in this case the rate had to be 44100, but it may vary depending on file type):

rem Mono Channel, 32kbps.
for %%f in (*.mp3) do ffmpeg -i "%%f" -acodec flac -bits_per_raw_sample 16 -ar 44100 -ac 1 "%%~nf.flac"
pause

— op4, Feb 06 '20 at 17:50
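For anyone who wants to do both steps from R, here is a minimal sketch of the workflow described in the comments: convert each MP3 to mono FLAC with ffmpeg, then pass `encoding` and `sampleRateHertz` explicitly to `gl_speech()`. The helper names (`flac_path`, `ffmpeg_args`, `transcribe_mp3`) are hypothetical, the `-y` overwrite flag is my addition, and the sketch assumes that ffmpeg is on the PATH, that authentication is already set up, and that 44100 Hz suits your files.

```r
# Hypothetical helper: swap the file extension for .flac.
flac_path <- function(mp3_path) {
  paste0(tools::file_path_sans_ext(mp3_path), ".flac")
}

# Hypothetical helper: ffmpeg arguments mirroring the flags from the comment
# above. "-y" (overwrite existing output) is an addition, so that ffmpeg does
# not stop to prompt when the .flac file already exists.
ffmpeg_args <- function(mp3_path, sample_rate = 44100L) {
  c("-y", "-i", shQuote(mp3_path),
    "-acodec", "flac", "-bits_per_raw_sample", "16",
    "-ar", as.character(sample_rate), "-ac", "1",
    shQuote(flac_path(mp3_path)))
}

# Hypothetical wrapper: convert one MP3, then transcribe the FLAC copy.
# Assumes googleLanguageR is installed and already authenticated.
transcribe_mp3 <- function(mp3_path, sample_rate = 44100L, language = "en-US") {
  status <- system2("ffmpeg", ffmpeg_args(mp3_path, sample_rate))
  if (status != 0) stop("ffmpeg failed for ", mp3_path)
  result <- googleLanguageR::gl_speech(
    flac_path(mp3_path),
    encoding        = "FLAC",
    sampleRateHertz = sample_rate,  # must match the rate used in the conversion
    languageCode    = language
  )
  result$transcript
}

# Usage over a folder (path taken from the question; adjust to your setup):
# files <- list.files("/audio_location", pattern = "\\.mp3$", full.names = TRUE)
# transcripts <- lapply(files, transcribe_mp3)
```

Keeping the sample rate in one argument means the value given to ffmpeg and the value reported to the API cannot drift apart, which seems to be the mismatch that caused the empty results here.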
