Recording Audio using PyAudio vs Web Audio API

Question

I am looking for help building a Shazam clone using the Shazam API on RapidAPI: https://rapidapi.com/apidojo/api/shazam/. I am using the songs/v2/detect endpoint.

In my backend, I convert the raw audio file to a byte array and base64-encode it before calling the Rapid Hub API.

In my frontend, I am trying to record the audio using React and the Web Audio API, but it is not working.

Here is my frontend code for reference:

import React, { useState, useEffect } from 'react';
import axios from 'axios';

function AudioRecorder() {
  const [recording, setRecording] = useState(false);
  const [mediaRecorder, setMediaRecorder] = useState(null);
  const [audioChunks, setAudioChunks] = useState([]);

  const audioConstraints = {
    audio: {
      channelCount: 1, // Set to 1 for Mono
      sampleRate: 44100, // Set to 44100 Hz
      sampleSize: 16, // Set to 16 bits
    },
  };

  useEffect(() => {
    if (mediaRecorder) {
      mediaRecorder.addEventListener('dataavailable', handleDataAvailable);
      mediaRecorder.addEventListener('stop', handleRecordingStopped);
    }
  }, [mediaRecorder]);

  const handleDataAvailable = (event) => {
    if (event.data.size > 0) {
      setAudioChunks((prevChunks) => [...prevChunks, event.data]);
    }
  };

  const handleRecordingStopped = () => {
    setRecording(false);
  };

  const startRecording = () => {
    navigator.mediaDevices.getUserMedia(audioConstraints)
      .then((stream) => {
        const newMediaRecorder = new MediaRecorder(stream);
        setMediaRecorder(newMediaRecorder);
        setRecording(true);
        newMediaRecorder.start();
      })
      .catch((error) => {
        console.error('Error accessing microphone:', error);
      });
  };

  const stopRecording = () => {
    if (mediaRecorder) {
      mediaRecorder.stop();
    }
  };

  const handleUpload = () => {
    if (audioChunks.length > 0) {
      const combinedBlob = new Blob(audioChunks, { type: 'audio/raw' });
      const formData = new FormData();
      formData.append('audio', combinedBlob, 'recording.raw');

      axios.post('http://localhost:8080/api/v1/shazam/songDetection', formData)
        .then((response) => {
          console.log('Audio uploaded successfully:', response);
        })
        .catch((error) => {
          console.error('Error uploading audio:', error);
        });
    }
  };

  return (
    <div>
      <button onClick={startRecording} disabled={recording}>
        Start Recording
      </button>
      <button onClick={stopRecording} disabled={!recording}>
        Stop Recording
      </button>
      <button onClick={handleUpload} disabled={audioChunks.length === 0}>
        Upload Audio
      </button>
    </div>
  );
}

export default AudioRecorder;

In contrast, when I record the audio with a Python script, the Rapid Hub API detects the song correctly.

Here is the Python script for reference:

#!/usr/bin/env python3

import pyaudio
import wave

def record_audio(output_file):
    # Configure audio recording parameters
    CHUNK = 1024  # Number of audio frames per buffer
    FORMAT = pyaudio.paInt16  # Sample format (16-bit integer)
    CHANNELS = 1  # Number of audio channels (mono)
    RATE = 44100  # Sample rate

    # Create an instance of the PyAudio class
    audio = pyaudio.PyAudio()

    # Open the audio stream for recording
    stream = audio.open(format=FORMAT,
                        channels=CHANNELS,
                        rate=RATE,
                        input=True,
                        frames_per_buffer=CHUNK)

    # Create a buffer to store the recorded audio frames
    frames = []

    # Record audio frames
    print("Recording started. Press Ctrl+C to stop.")
    try:
        while True:
            data = stream.read(CHUNK)
            frames.append(data)
    except KeyboardInterrupt:
        pass

    # Stop and close the audio stream
    stream.stop_stream()
    stream.close()
    audio.terminate()

    # Save the recorded audio as a raw audio file (PCM or WAV)
    with open(output_file, "wb") as file:
        file.write(b"".join(frames))
    print(f"Recording saved to {output_file}.")

if __name__ == '__main__':
    output_file = "recorded_audio.raw"  # Replace with your desired file name and extension
    record_audio(output_file)

Can you help me figure out what I am doing wrong in the React code?

I keep getting an empty response from the API, and the documentation says: "If the result is empty, your request data must be in wrong format in most case".

I tried updating my frontend to record audio in the required format (44100 Hz, 1 channel (mono), signed 16-bit PCM, little endian), but it does not seem to work.

EDIT:

I tried the following with extendable-media-recorder:

import { MediaRecorder } from 'extendable-media-recorder';
import { useEffect, useState } from 'react';
import axios from 'axios';

function RecorderAudio() {

  const audioConstraints = {
    audio: {
      sampleRate: 44100,
      channelCount: 1, // Set to 1 for Mono
      sampleSize: 16, // Set to 16 bits
    },
  };

  const [recording, setRecording] = useState(false);
  const [mediaRecorder, setMediaRecorder] = useState(null);
  const [audioChunks, setAudioChunks] = useState([]);

  useEffect(() => {
    if (mediaRecorder) {
      mediaRecorder.addEventListener('dataavailable', handleDataAvailable);
      mediaRecorder.addEventListener('stop', handleRecordingStopped);
    }
  }, [mediaRecorder]);

  const handleDataAvailable = (event) => {
    if (event.data.size > 0) {
      setAudioChunks((prevChunks) => [...prevChunks, event.data]);
    }
  };

  const handleRecordingStopped = () => {
    setRecording(false);
  };

  async function startRecording() {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const audioContext = new AudioContext({ sampleRate: 44100 });
    const mediaStreamAudioSourceNode = new MediaStreamAudioSourceNode(audioContext, { mediaStream: stream });
    const mediaStreamAudioDestinationNode = new MediaStreamAudioDestinationNode(audioContext);
    mediaStreamAudioSourceNode.connect(mediaStreamAudioDestinationNode);
    const mediaRecorder = new MediaRecorder(mediaStreamAudioDestinationNode.stream, { type: "audio/mp3" });
    setMediaRecorder(mediaRecorder);
    setRecording(true);
    mediaRecorder.start();
  };

  const stopRecording = () => {
    if (mediaRecorder) {
      mediaRecorder.stop();
    }
  };

  const handleUpload = () => {
    if (audioChunks.length > 0) {
      const combinedBlob = new Blob(audioChunks, { type: 'audio/mp3' });
      const formData = new FormData();
      formData.append('audio', combinedBlob, 'recording.mp3');

      axios.post('http://localhost:8080/api/v1/shazam/songDetection', formData, {
        headers: { 'Content-Type': 'multipart/form-data' },
      })
        .then((response) => {
          console.log('Audio uploaded successfully:', response);
        })
        .catch((error) => {
          console.error('Error uploading audio:', error);
        });
    }
  };

  return (
    <div>
      <button onClick={startRecording} disabled={recording}>
        Start Recording
      </button>
      <button onClick={stopRecording} disabled={!recording}>
        Stop Recording
      </button>
      <button onClick={handleUpload} disabled={audioChunks.length === 0}>
        Upload Audio
      </button>
    </div>
  );
};

export default RecorderAudio;

score 0 · Answer 1 · answered Jul 12 '23 at 16:36

It looks like you're trying to record something as 'audio/raw'. This is not supported in any browser. Right now every browser uses a different format when recording something with a MediaRecorder but it's always lossy.

I don't really know which format you're targeting but there is a comment in your Python code which says "Save the recorded audio as a raw audio file (PCM or WAV)". There are a few libraries out there which allow you to record audio as 'audio/wav' in the browser. Maybe that's what you're looking for.

One of those libraries is extendable-media-recorder (which I am the author of) but it's not the only one.
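
One way to see what the browser actually uploaded is to inspect the first few bytes of the file on the backend. The following is a minimal Python sketch and not part of the original code: the function name `sniff_audio_container` is made up for illustration, and it only recognizes a handful of common signatures. Labeling a Blob as 'audio/raw' does not change the bytes the MediaRecorder produced, so a recording made in Chrome would typically show up here as WebM.

```python
def sniff_audio_container(data: bytes) -> str:
    """Guess the container format of an uploaded audio blob from its leading bytes.

    Hypothetical helper for debugging; it covers only a few well-known signatures.
    """
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"\x1a\x45\xdf\xa3":  # EBML header used by WebM/Matroska
        return "webm"
    if data[:4] == b"OggS":  # Ogg page marker
        return "ogg"
    if data[4:8] == b"ftyp":  # MP4/M4A file type box
        return "mp4"
    if data[:3] == b"ID3":  # MP3 file starting with an ID3v2 tag
        return "mp3"
    # Headerless PCM has no signature, so it ends up here
    # together with anything else that is not recognized.
    return "unknown"
```

If this reports anything other than "unknown" for a file that is supposed to be raw PCM, the upload still contains a container or a compressed stream and needs to be converted before it is sent on.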

answered Jul 12 '23 at 16:36

chrisguttandin


The Shazam `songs/v2/detect` API (https://rapidapi.com/apidojo/api/shazam/) expects audio in the following format: `The raw sound data must be 44100Hz, 1 channel (Mono), signed 16 bit PCM little endian. Other types of media are NOT supported, such as : mp3, wav, etc… or need to be converted to uncompressed raw data`. Can I record audio directly in raw format, or do I need to record in WAV format and convert it to raw? – Avnish Shetty Jul 13 '23 at 03:33
Let me give the extendable-media-recorder library a try. Thanks Chris – Avnish Shetty Jul 14 '23 at 06:49
If the audio is in 44100 Hz already you should be able to convert it from WAV to PCM by removing the first 44 bytes. That's the size of the header. – chrisguttandin Jul 15 '23 at 13:06
Hey @chrisguttandin, I tried out extendable-media-recorder in React. It does not seem to work: when I call the Shazam API the response is empty, which according to the documentation means the format is wrong: `If the result is empty, your request data must be in wrong format in most case`. Any idea how I can verify whether the format of the recorded song is correct, i.e. `44100Hz, 1 channel (Mono), signed 16 bit PCM little endian`? – Avnish Shetty Jul 20 '23 at 05:07
Unfortunately there is no way to check if some binary data is PCM data or not. By definition it contains no header or other structure around the actual data. It's only the data. Therefore there should also be no way for the API that you are using to check if the uploaded data is PCM or not. It's possible to interpret any binary data as PCM data. Maybe it's because you explicitly set the mime type to `'audio/mp3'`? At least that's what the snippet that you posted above does. Maybe the API first checks the mime type before doing anything else. – chrisguttandin Jul 20 '23 at 22:22
Hey @chrisguttandin I did try mime type as audio/raw and audio/wav. In audio/wav I also deleted the first 44 bytes of the header in my backend and it still did not work – Avnish Shetty Jul 20 '23 at 23:33
I'm not entirely sure what the problem is. Maybe you can try to debug this by trying to upload one of the existing files that you recorded with your Python script from within the browser to see if that works before you start to record directly in the browser. – chrisguttandin Jul 26 '23 at 12:41
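
As a follow-up to the debugging suggestions in the comments: headerless PCM cannot be validated, but a WAV file can, because its header states the sample rate, channel count, and sample width. The sketch below uses Python's standard `wave` module to check those values before returning the sample data, instead of blindly cutting 44 bytes (a WAV header can be longer than 44 bytes when it carries extra chunks). The function names are hypothetical, and the code assumes the recording reaches the backend as a PCM-encoded WAV file.

```python
import io
import wave

# Format required by the songs/v2/detect documentation quoted above.
EXPECTED_RATE = 44100   # Hz
EXPECTED_CHANNELS = 1   # mono
EXPECTED_WIDTH = 2      # bytes per sample (16 bit)


def wav_to_raw_pcm(wav_bytes: bytes) -> bytes:
    """Return the headerless sample data of a WAV file after checking its format.

    Raises ValueError if the WAV is not 44100 Hz, mono, 16 bit.
    wave.open itself raises wave.Error for input it cannot parse as PCM WAV.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        actual = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
        expected = (EXPECTED_RATE, EXPECTED_CHANNELS, EXPECTED_WIDTH)
        if actual != expected:
            raise ValueError(
                f"unexpected format (rate, channels, bytes per sample): "
                f"got {actual}, need {expected}"
            )
        # 16-bit WAV samples are stored as signed little-endian integers,
        # so the frames can be passed on unchanged.
        return wav.readframes(wav.getnframes())


def pcm_duration_seconds(pcm: bytes) -> float:
    """Duration of raw PCM data, assuming it really is 44100 Hz mono 16 bit."""
    return len(pcm) / (EXPECTED_RATE * EXPECTED_CHANNELS * EXPECTED_WIDTH)
```

A `ValueError` or `wave.Error` here would point to a format mismatch, for example a stereo or 48000 Hz recording. The file produced by the Python script can serve as a known-good reference: for a recording of similar length, the browser-derived PCM should have a similar byte count and duration.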
