1

I'm using this code for connecting to Whisper API and transcribe in bulk all mp3 in a folder to both srt and vtt:

import requests
import os
import openai

folder_path = "/content/audios/"
def transcribe_and_save(file_path, format):
    url = 'https://api.openai.com/v1/audio/transcriptions'
    headers = {'Authorization': 'Bearer MyToken'}
    files = {'file': open(file_path, 'rb'), 
            'model': (None, 'whisper-1'),
            'response_format': format}
    response = requests.post(url, headers=headers, files=files)
    output_path = os.path.join(folder_path, os.path.splitext(filename)[0] + '.' + format)
    with open(output_path, 'w') as f:
        f.write(response.content.decode('utf-8'))

for filename in os.listdir(folder_path):
    if filename.endswith('.mp3'):
        file_path = os.path.join(folder_path, filename)
        transcribe_and_save(file_path, 'srt')
        transcribe_and_save(file_path, 'vtt')
else:
    print('mp3s not found in folder')

When I use this code, I'm getting the following error:

"error": {
    "message": "1 validation error for Request\nbody -> response_format\n  value is not a valid enumeration member; permitted: 'json', 'text', 'vtt', 'srt', 'verbose_json' (type=type_error.enum; enum_values=[<ResponseFormat.JSON: 'json'>, <ResponseFormat.TEXT: 'text'>, <ResponseFormat.VTT: 'vtt'>, <ResponseFormat.SRT: 'srt'>, <ResponseFormat.VERBOSE_JSON: 'verbose_json'>])",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }

I've tried with different values, but either don't work or I'm only receiving the transcription as a object in plain text, but no srt or vtt. I'm expecting to get srt and vtt files in the same folder as where audios are

Thanks, Javi

waghler
  • 41
  • 6

3 Answers3

3

I've found the solution, the problem was in one of the parameters 'response_format': (None, output_format):

def transcribe_and_save(file_path, output_format):
    url = 'https://api.openai.com/v1/audio/transcriptions'
    headers = {'Authorization': 'Bearer myToken'}
    files = {'file': open(file_path, 'rb'),
             'model': (None, 'whisper-1'),
             'response_format': (None, output_format)}
    response = requests.post(url, headers=headers, files=files)
    output_path = os.path.join(folder_path, os.path.splitext(os.path.basename(file_path))[0] + '.' + output_format)
    with open(output_path, 'w') as f:
        f.write(response.content.decode('utf-8'))

for filename in os.listdir(folder_path):
    if filename.endswith('.mp3'):
        file_path = os.path.join(folder_path, filename)
        transcribe_and_save(file_path, 'srt')
        transcribe_and_save(file_path, 'vtt')
else:
    print('mp3s not found in folder')
waghler
  • 41
  • 6
0

I am not sure about the whisper api, but you seem to be using an already existing python function as a parameter name. Perhaps this could be a reason why it is not working, as the function format is being used when calling the endpoint instead of the parameter you passed in.

Try changing the parameter name to something other than format and change the value being used for response_format.

Thomasssb1
  • 36
  • 4
  • Thanks, Thomasssb1! Yes, you're right. my mistake :). I changed that, but I'm still getting the same error. if I delete the response_format parameter: files = {'file': open(file_path, 'rb'), 'model': (None, 'whisper-1')} #'response_format': response_format} I'm not getting an error, but the object with the transcribed text: {"text":"This is Stella. She's eight."} But what I want to have are the srt and vtt format, not just the transcription. Thanks! – waghler Mar 06 '23 at 11:09
0

Here's a working Solution for single files:

import requests
import os

OPENAI_API_KEY = "123xyzxyzxyzxyzxyzxyzxyzxyz"

token = f"Bearer {OPENAI_API_KEY}"

url = "https://api.openai.com/v1/audio/transcriptions"
model_name ="whisper-1"

headers ={
    "Authorization": token,
    "Content-Type": "multipart/form-data"
}

file_path ="1.mp3"
with open(file_path,"rb") as file:
    file_content = file.read()

payload = {
    "name": os.path.basename(file_path),
    "response_format": "json",
    "prompt": "transcribe this Chapter",
    "language": "de",
    "model": model_name
}

files = {
    "file": (os.path.basename(file_path), file_content, "audio/mp3")
}

response = requests.post(url, headers=headers, data=payload, files=files)


print(response.text)
dazzafact
  • 2,570
  • 3
  • 30
  • 49