Questions tagged [openai-whisper]

Whisper is a general-purpose speech recognition library by OpenAI.

175 questions
1
vote
1 answer

Decode mediaRecorder audio file in Python

I am sending 2-second audio files from my TypeScript frontend to my Python Flask backend, where I need to access these files and convert them to MP3 files for the Whisper model. My function to send the data is: function sendData() { //This line checks if…
Hugo Albert
  • 31
  • 1
  • 3
1
vote
3 answers

transformers.js with whisper and return_timestamps

I am new to both transformers.js and Whisper, and I am trying to make the return_timestamps parameter work... I managed to customize script.js from the transformers.js demo locally and added data.generation.return_timestamps = "char"; around line ~447 inside…
Yoz
  • 707
  • 6
  • 20
1
vote
1 answer

TypeError when using Whisper AI to transcribe

I want to transcribe an audio file using Whisper AI. I learned from an article (https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/) that using Python version 3.8.16 and PyTorch 1.12.1 would do the job. I also…
1
vote
1 answer

How to convert a buffer to a readable file in NodeJS

I receive a buffer through an input: const fileData = Buffer.concat(chunks); I then send this input to OpenAI's Whisper, which expects a file: const resp = await openai.createTranscription( //@ts-ignore fileData, "whisper-1", ); This doesn't…
1
vote
2 answers

How to disable sending telemetry and log data to OpenAI when using Azure OpenAI Service

I am working with security data and need to make sure we disable sending telemetry and log data to OpenAI when using Azure OpenAI Service. Does anyone know how to go about it?
prajwal rao
  • 87
  • 1
  • 9
1
vote
1 answer

Transcription via OpenAI's Whisper: AssertionError: incorrect audio shape

I'm trying to use OpenAI's open source Whisper library to transcribe audio files. Here is my script's source code: import whisper model = whisper.load_model("large-v2") # load the entire audio file audio =…
1
vote
1 answer

How do I finetune a Whisper ASR model by streaming the common_voice dataset?

I am trying to download a huge voice dataset from Hugging Face. To avoid using disk space, I am trying to download the data in streaming mode. Everything goes fine until it is time to train the model. common_voice =…
1
vote
3 answers

With the Whisper API, when I use a Python script to transcribe audio files in bulk, I can't get the response_format ('srt' or 'vtt') to work

I'm using this code to connect to the Whisper API and bulk-transcribe all MP3 files in a folder to both srt and vtt: import requests import os import openai folder_path = "/content/audios/" def transcribe_and_save(file_path, format): url =…
waghler
  • 41
  • 6
1
vote
1 answer

Python can't import Whisper - ctypes.util.find_library('c') returns None

I use PyCharm on Windows 10 and run Python 3.9.9. This is my code. I want to import whisper but get an error at line 70 of whisper.py. I checked that line: libc_name = ctypes.util.find_library('c'). The find_library function can't find the 'c' library and…
Mustafa
  • 87
  • 1
  • 8
1
vote
0 answers

No such file or directory: 'ffmpeg' on MacOS in Python OpenAI Whisper

I am using macOS (Apple Silicon) and I am trying to use the whisper module from OpenAI in Python. My code is this: import whisper file_path = "4547.mp3" model = whisper.load_model("base") result =…
1
vote
1 answer

No such file or directory: 'ffmpeg' on MacOS in Python

I am using macOS (Apple Silicon) and I am trying to use the whisper module from OpenAI in Python. My code is this: import whisper file_path = "4547.mp3" model = whisper.load_model("base") result =…
1
vote
1 answer

spaCy sentence separation with dictionary source from OpenAI Whisper / WhisperX?

WhisperX is a Whisper extension that does a really excellent job of speech to text with per-word timestamps. I'd like to use spaCy to split the text strings into sensible clauses but maintain a connection to the source dictionary so the result…
Dom I Yes
  • 11
  • 1
1
vote
0 answers

Trying to numerically match python Log-Mel Spectrogram in Accelerate / Swift

I am working on a native port of OpenAI's Whisper for macOS and iOS via CoreML and Accelerate / AVFoundation, and in doing so noticed numerical differences between my Log Mel Spectrogram and Whisper's. This Python notebook extracts the Log Mel…
vade
  • 702
  • 4
  • 22
1
vote
1 answer

How to use whisper.transcribe with ogg audio in a byte variable

I am trying to use the transcribe method from OpenAI's Whisper Python module without loading the audio file from the file system. In my code I have downloaded an OGG audio file from a Matrix server repository and now want to transcribe that.…
Tupsi
  • 33
  • 6
1
vote
2 answers

How do I run Whisper on an entire directory?

I'd like to transcribe speech to text using Whisper. I have been able to successfully run it on a single file using the command: whisper audio.wav. I'd like to run it on a large number of files in a single directory called "Audio" on my desktop. I…
Alligator
  • 691
  • 3
  • 11
  • 21