
I've been at this all night. I'm attempting to record myself on my iPhone in Expo using expo-av (recording speech on the iPhone) and upload it to OpenAI's transcriptions endpoint using the whisper-1 model.

The file is saved as mp4 and I convert it to a base64 string. I have confirmed, using a base64-to-file converter and a file-upload checker, that the base64 content is in fact mp4.

Here's the react-native code:

  const recordingOptions = {
    android: {
      extension: ".mp4",
      outputFormat: Audio.AndroidOutputFormat.MPEG_4,
      audioEncoder: Audio.AndroidAudioEncoder.AAC,
      sampleRate: 44100,
      numberOfChannels: 2,
      bitRate: 128000,
    },
    ios: {
      extension: ".mp4",
      // outputFormat: Audio.IOSOutputFormat.MPEG4AAC,
      audioQuality: Audio.IOSAudioQuality.HIGH,
      sampleRate: 44100,
      numberOfChannels: 2,
      bitRate: 128000,
    },
    web: {
      mimeType: "audio/mp4",
      bitsPerSecond: 128000 * 8,
    },
  };
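For context, a minimal sketch of how options like these plug into expo-av's recording flow (this mirrors the expo-av recording API, not code from the question; error handling and UI are omitted, and it assumes it runs inside an Expo app):

```javascript
import { Audio } from "expo-av";

// Sketch: start a recording with the options above and return its file URI.
async function recordOnce(recordingOptions) {
  await Audio.requestPermissionsAsync();
  await Audio.setAudioModeAsync({
    allowsRecordingIOS: true,
    playsInSilentModeIOS: true,
  });

  // createAsync prepares and starts the recording in one call
  const { recording } = await Audio.Recording.createAsync(recordingOptions);

  // ...record for a while, then stop and read back the local file URI
  await recording.stopAndUnloadAsync();
  return recording.getURI();
}
```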

actual implementation:

      const recordingUri = recording.getURI();
      const recordingBase64 = await ExpoFileSystem.readAsStringAsync(
        recordingUri,
        {
          encoding: ExpoFileSystem.EncodingType.Base64,
        }
      );
      const languageCode = "en"; // English
      console.log(languageCode);
      console.log(recordingBase64);

      const buffer = Buffer.from(recordingBase64, "base64");
      const blob = new Blob([buffer], { type: "audio/mp4" });
      const file = new File([blob], "test.mp4", { type: "audio/mp4" });

      const formData = new FormData();
      formData.append("file", file);
      formData.append("model", "whisper-1");

      const apiUrl = "https://api.openai.com/v1/audio/transcriptions";

      const requestOptions = {
        method: "POST",
        headers: {
          Authorization: `Bearer ${OPENAI_API_KEY}`,
        },
        body: formData,
      };

      fetch(apiUrl, requestOptions)
        .then((response) => response.json())
        .then((data) => console.log(data))
        .catch((error) => console.log(error));

and every time the response is:

{"error": {"code": null, "message": "Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']", "param": null, "type": "invalid_request_error"}}

Does anyone have any idea what I'm doing wrong?

jawn
  • I have the same problem. Did you manage to solve this? – Michael Ceber Apr 10 '23 at 21:23
  • @MichaelCeber I ended up passing this base64 to my node backend and using the Buffer class to write the file then read it via fs.createReadStream and pass that to openai's node library. – jawn Apr 11 '23 at 16:06
  • Ahh funny, as after I sent this message, that's exactly what I did, moved it off the UI and to the back end code which was reliable (my back end though is c#) - was always going to put it in the back end anyway as need to secure the api key etc there... – Michael Ceber Apr 12 '23 at 21:07
  • 1
    @MichaelCeber yeah man I tried everything, something's up with the Buffer implementation on the client side for this scenario. glad you got it to work. – jawn Apr 13 '23 at 14:38

1 Answer


Try adding a filename to the formData.append. Something similar to this:

formData.append('file', file, 'input.mp4');

Whisper shouldn't rely on the extension, but it seems like it does.
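You can see the effect of that third argument with the standard FormData API (available in browsers and Node 18+). The bytes below are dummy data, not a real mp4; the point is only that the filename travels with the multipart part:

```javascript
// The third argument to FormData.append sets the filename the server sees;
// Whisper appears to use that extension to detect the container format.
const bytes = new Uint8Array([0, 0, 0, 24]); // dummy bytes, not a real mp4
const blob = new Blob([bytes], { type: "audio/mp4" });

const formData = new FormData();
formData.append("file", blob, "input.mp4"); // filename attached here
formData.append("model", "whisper-1");

console.log(formData.get("file").name); // "input.mp4"
```

In React Native specifically, many people skip the Blob/Buffer round-trip entirely and append `{ uri, name, type }` directly to FormData, which also carries a filename, but the snippet above sticks to the standard API.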

Radu Diță
  • Yes, I confirm this is a good answer, had the same issue and solved it like that. – acortad May 19 '23 at 12:08
  • I tried adding the filename with the above implementation but my code fails when defining the blob. https://stackoverflow.com/questions/76367660/react-native-using-expo-av-ios-mp4-file-openais-audio-transcriptions-invalid-fi – Ibra May 30 '23 at 18:31