Whisper is a general-purpose speech recognition library by OpenAI.
Questions tagged [openai-whisper]
175 questions
1
vote
1 answer
Decode mediaRecorder audio file in Python
I am sending 2-second audio files from my TypeScript frontend to my Python Flask backend, where I need to access these files and convert them to MP3 files for the Whisper model.
My function to send the data is:
function sendData() {
//This line checks if…

Hugo Albert
- 31
- 1
- 3
1
vote
3 answers
transformers.js with whisper and return_timestamps
I am new to both transformers.js and Whisper, and am trying to make the return_timestamps parameter work...
I managed to customize script.js from the transformers.js demo locally and added data.generation.return_timestamps = "char"; around line ~447 inside…

Yoz
- 707
- 6
- 20
1
vote
1 answer
TypeError when using Whisper AI to transcribe
I want to transcribe an audio file using Whisper AI.
I learned from an article (https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/) that using Python version 3.8.16 and PyTorch 1.12.1 would do the job.
I also…

Chris
- 11
- 2
1
vote
1 answer
How to convert a buffer to a readable file in NodeJS
I receive a buffer through an input:
const fileData = Buffer.concat(chunks);
I then send this input into OpenAI's Whisper, which expects a file:
const resp = await openai.createTranscription( //@ts-ignore
fileData,
"whisper-1",
);
This doesn't…

A Man Who Needs Help
- 35
- 6
1
vote
2 answers
How to disable sending telemetry and log data to OpenAI when using Azure OpenAI Service
I am working with security data and need to make sure we disable sending telemetry and log data to OpenAI when using Azure OpenAI Service.
Does anyone know how to go about it?

prajwal rao
- 87
- 1
- 9
1
vote
1 answer
Transcription via OpenAi's whisper: AssertionError: incorrect audio shape
I'm trying to use OpenAI's open source Whisper library to transcribe audio files.
Here is my script's source code:
import whisper
model = whisper.load_model("large-v2")
# load the entire audio file
audio =…

muratowski
- 25
- 4
1
vote
1 answer
How do I finetune a Whisper ASR model by streaming the common_voice dataset?
I am trying to download a huge voice dataset from Hugging Face. To avoid using disk space, I am trying to download the data in streaming mode. Everything goes fine until it is time to train the model.
common_voice =…

PJAX
- 11
- 1
1
vote
3 answers
On Whisper API, when I try to use a Python script to transcribe audio files in bulk, I can't get the response_format ('srt' or 'vtt') to work
I'm using this code to connect to the Whisper API and transcribe in bulk all MP3 files in a folder to both srt and vtt:
import requests
import os
import openai
folder_path = "/content/audios/"
def transcribe_and_save(file_path, format):
url =…

waghler
- 41
- 6
1
vote
1 answer
Python can't import Whisper - ctypes.util.find_library('c') returns None
I use PyCharm on Windows 10 and run Python 3.9.9. It's my code. I want to import whisper but get an error at line 70 of whisper.py. I have checked that line: libc_name = ctypes.util.find_library('c'). The find_library function can't find the 'c' library and…

Mustafa
- 87
- 1
- 8
1
vote
0 answers
No such file or directory: 'ffmpeg' on MacOS in Python OpenAI Whisper
I am using MacOS (Apple Silicon) and I am trying to use the whisper module from OpenAI in Python. My code is this:
import whisper
file_path = "4547.mp3"
model = whisper.load_model("base")
result =…

Harper Bledsoe
- 31
- 5
1
vote
1 answer
No such file or directory: 'ffmpeg' on MacOS in Python
I am using MacOS (Apple Silicon) and I am trying to use the whisper module from OpenAI in Python. My code is this:
import whisper
file_path = "4547.mp3"
model = whisper.load_model("base")
result =…

Harper Bledsoe
- 31
- 5
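A note on the two 'ffmpeg' questions above: Whisper's Python package decodes audio by invoking the ffmpeg command-line tool, so "No such file or directory: 'ffmpeg'" generally means the executable is not on the PATH seen by the Python process. The sketch below is only a quick diagnostic, not a fix; the helper name ffmpeg_available is hypothetical and not part of Whisper.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH.

    Hypothetical helper for illustration; it is not part of the whisper package.
    """
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found on PATH; install it and restart the interpreter or IDE")
```

If the check fails inside an IDE but succeeds in a terminal, the two are probably running with different PATH values.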
1
vote
1 answer
spaCy sentence separation with dictionary source from OpenAI Whisper / WhisperX?
WhisperX is a Whisper extension that does a really excellent job of speech to text with per-word timestamps.
I'd like to use spaCy to split up the text strings into sensible clauses but maintain a connection to the source dictionary so the result…

Dom I Yes
- 11
- 1
1
vote
0 answers
Trying to numerically match python Log-Mel Spectrogram in Accelerate / Swift
I am working on a native port of OpenAI's Whisper for macOS and iOS via CoreML and Accelerate / AVFoundation, and in doing so noticed numerical differences between my Log-Mel Spectrogram and Whisper's.
This Python notebook extracts the Log Mel…

vade
- 702
- 4
- 22
1
vote
1 answer
How to use whisper.transcribe with ogg audio in a byte variable
I am trying to use the transcribe method from OpenAI's Whisper Python module without loading the audio from the file system. In my code I have downloaded an ogg audio file from a Matrix server repository and now want to transcribe that.…

Tupsi
- 33
- 6
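For the in-memory ogg question above, one possible workaround (not a true in-memory solution, and not necessarily what the answer proposes) is to write the bytes to a short-lived temporary file and hand that path to the transcriber. The sketch assumes the transcriber is any callable that takes a file path and returns a dict, which is how Whisper's model.transcribe behaves; transcribe_bytes is a hypothetical helper name.

```python
import os
import tempfile
from typing import Callable


def transcribe_bytes(
    data: bytes,
    transcribe: Callable[[str], dict],
    suffix: str = ".ogg",
) -> dict:
    """Write in-memory audio to a temporary file and pass its path to `transcribe`.

    `transcribe` is an assumption: any callable taking a path and returning a dict.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "audio" + suffix)
        with open(path, "wb") as handle:
            handle.write(data)
        # The file is deleted when this block exits, so transcribe before leaving it.
        return transcribe(path)
```

With Whisper installed, usage would look like transcribe_bytes(ogg_bytes, whisper.load_model("base").transcribe), where ogg_bytes is a placeholder for the downloaded audio.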
1
vote
2 answers
How do I run Whisper on an entire directory?
I'd like to transcribe speech to text using Whisper. I have been able to run it successfully on a single file using the command:
whisper audio.wav
I'd like to run it on a large number of files in a single directory called "Audio" on my desktop. I…

Alligator
- 691
- 3
- 11
- 21
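For the directory question above, a minimal sketch of one approach in Python: collect the audio files in a folder, then call a transcriber on each one. The extension list, the helper names find_audio_files and transcribe_directory, and the injected transcribe callable are all assumptions for illustration; with Whisper installed, the callable could be whisper.load_model("base").transcribe, whose result dict contains a "text" key.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Assumed set of extensions; adjust to whatever the folder actually contains.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".m4a", ".flac", ".ogg")


def find_audio_files(
    directory: str, extensions: Iterable[str] = AUDIO_EXTENSIONS
) -> List[Path]:
    """Return the audio files directly inside `directory`, sorted by path."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


def transcribe_directory(
    directory: str, transcribe: Callable[[str], dict]
) -> Dict[str, str]:
    """Map each audio file name to the "text" field returned by `transcribe`."""
    results: Dict[str, str] = {}
    for path in find_audio_files(directory):
        results[path.name] = transcribe(str(path))["text"]
    return results
```

Loading the model once and reusing it for every file avoids paying the model load cost per file, which is what happens when the whisper command is launched separately for each one.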