I've been at this all night. I'm trying to record speech on my iPhone in an Expo app using expo-av and upload it to OpenAI's transcriptions endpoint using the whisper-1 model.
The file is saved as mp4, and I convert it to a base64 string. I have confirmed that the base64 content is in fact mp4 using:
a base64-to-file conversion tool
a file upload and checking tool
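The same check can also be done in code rather than with online tools. The following is a minimal sketch, assuming a `Buffer` implementation is available (the global one in Node, or the `buffer` package in React Native); `looksLikeMp4` is a hypothetical helper name, not part of expo-av or any library. It relies on the fact that MP4/M4A files normally start with a box whose 4-byte type, at byte offset 4, is `ftyp`, followed by a 4-byte brand such as `isom`, `mp42`, or `M4A `.

```javascript
// Assumption: a Buffer implementation is available (Node global or the "buffer" package).
const { Buffer } = require("buffer");

// Hypothetical helper: inspects only the first 12 decoded bytes of the base64 string.
function looksLikeMp4(base64) {
  // 16 base64 characters decode to exactly 12 bytes:
  // 4 bytes box size + 4 bytes box type + 4 bytes major brand.
  const head = Buffer.from(base64.slice(0, 16), "base64");
  if (head.length < 12) {
    return { isMp4: false, brand: null };
  }
  const boxType = head.toString("latin1", 4, 8);
  const isMp4 = boxType === "ftyp";
  // The brand is only meaningful when the first box really is "ftyp".
  const brand = isMp4 ? head.toString("latin1", 8, 12) : null;
  return { isMp4, brand };
}
```

Calling `looksLikeMp4(recordingBase64)` on the string read from the recording would then report whether the data starts with an `ftyp` box and which brand the recorder wrote. This only confirms the container signature, not that the server will accept the upload.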
Here's the React Native code:
const recordingOptions = {
  android: {
    extension: ".mp4",
    outputFormat: Audio.AndroidOutputFormat.MPEG_4,
    audioEncoder: Audio.AndroidAudioEncoder.AAC,
    sampleRate: 44100,
    numberOfChannels: 2,
    bitRate: 128000,
  },
  ios: {
    extension: ".mp4",
    // outputFormat: Audio.IOSOutputFormat.MPEG4AAC,
    audioQuality: Audio.IOSAudioQuality.HIGH,
    sampleRate: 44100,
    numberOfChannels: 2,
    bitRate: 128000,
  },
  web: {
    mimeType: "audio/mp4",
    bitsPerSecond: 128000 * 8,
  },
};
The actual implementation:
const recordingUri = recording.getURI();
const recordingBase64 = await ExpoFileSystem.readAsStringAsync(
  recordingUri,
  {
    encoding: ExpoFileSystem.EncodingType.Base64,
  }
);
const languageCode = "en"; // English
console.log(languageCode);
console.log(recordingBase64);
const buffer = Buffer.from(recordingBase64, "base64");
const blob = new Blob([buffer], { type: "audio/mp4" });
const file = new File([blob], "test.mp4", { type: "audio/mp4" });
const formData = new FormData();
formData.append("file", file);
formData.append("model", "whisper-1");
const apiUrl = "https://api.openai.com/v1/audio/transcriptions";
const requestOptions = {
  method: "POST",
  headers: {
    Authorization: `Bearer ${OPENAI_API_KEY}`,
  },
  body: formData,
};
fetch(apiUrl, requestOptions)
  .then((response) => response.json())
  .then((data) => console.log(data))
  .catch((error) => console.log(error));
Every time, the response is:
{"error": {"code": null, "message": "Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']", "param": null, "type": "invalid_request_error"}}
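One variation that may be worth ruling out: React Native's `FormData` is usually given a `{ uri, name, type }` descriptor for file parts instead of a `Blob` or `File` built from a buffer, so the base64 round trip might not be needed at all. The following is a rough sketch under that assumption, not a confirmed fix. `audioPartFromUri`, `transcribe`, and the extension-to-MIME table are hypothetical names introduced here for illustration, and the MIME mapping is a guess at sensible values.

```javascript
// Hypothetical lookup table; extend it as needed for other recording formats.
const MIME_BY_EXTENSION = {
  mp4: "audio/mp4",
  m4a: "audio/mp4",
  wav: "audio/wav",
  mp3: "audio/mpeg",
};

// Builds the { uri, name, type } descriptor from a local recording URI.
function audioPartFromUri(uri) {
  const name = uri.split("/").pop();
  const extension = name.includes(".") ? name.split(".").pop().toLowerCase() : "";
  const type = MIME_BY_EXTENSION[extension];
  if (!type) {
    throw new Error(`Unsupported audio extension: "${extension}"`);
  }
  return { uri, name, type };
}

// Assumption: runs inside React Native, where FormData accepts the descriptor above.
async function transcribe(recordingUri, apiKey) {
  const formData = new FormData();
  formData.append("file", audioPartFromUri(recordingUri));
  formData.append("model", "whisper-1");
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: formData,
  });
  return response.json();
}
```

With this shape, `transcribe(recording.getURI(), OPENAI_API_KEY)` would send the file straight from disk, and the file name passed to the server keeps the real extension of the recording.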
Does anyone have any idea what I'm doing wrong?