Questions tagged [audio-processing]

Audio processing is the study of mathematical and signal processing techniques used to understand or alter audio signals. The kinds of audio signals studied include speech, music, environmental audio and computer audio. Audio is analyzed in the temporal or spectral domain, typically by applying various filters.

A key concept is to convert the audio to PCM format so that you have access to the raw audio curve (the waveform). Each channel has its own curve.

Digital audio is represented by a series of points on this curve. Each point is called an audio sample. The numerical value of each sample can be represented as either an integer or a floating-point number.
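As a rough illustration of the points above, the sketch below splits raw PCM bytes into one list of samples per channel and converts between integer and floating-point sample values. It assumes interleaved, little-endian, signed 16-bit PCM (the layout commonly found in WAV files); the function names are hypothetical and chosen only for this example.

```python
import struct


def deinterleave_int16(raw, channels):
    """Split interleaved 16-bit PCM bytes into one sample list per channel.

    Assumes little-endian signed 16-bit samples (an assumption of this sketch).
    """
    count = len(raw) // 2  # two bytes per sample
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    # Sample i belongs to channel i % channels.
    return [list(samples[ch::channels]) for ch in range(channels)]


def int16_to_float(sample):
    """Map an integer sample in [-32768, 32767] to a float in [-1.0, 1.0)."""
    return sample / 32768.0


def float_to_int16(value):
    """Map a float sample to a 16-bit integer, clipping to [-1.0, 1.0] first."""
    clipped = max(-1.0, min(1.0, value))
    return int(round(clipped * 32767))


# Two stereo frames: (left, right) = (100, -100) and (200, -200).
raw = struct.pack("<4h", 100, -100, 200, -200)
left, right = deinterleave_int16(raw, channels=2)
```

Scaling by 32768 on the way to float and by 32767 on the way back is one common convention among several; other libraries may scale differently.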

Note that storing each audio sample's numerical value typically requires several bytes. One byte can represent only 2^8 = 256 distinct values, which results in noticeable distortion. High-quality audio is typically stored using at least two bytes per sample, which gives 2^16 = 65,536 possible values for the height of the raw audio curve as it moves up and down. The more bytes used per sample, the higher the fidelity, because the gap between adjacent curve height measurements shrinks. The number of bits per sample is called the bit depth. CD-quality audio uses two bytes per sample per channel. The other fundamental aspect of digital audio is the sample rate, which determines the number of samples per second.
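The arithmetic behind bit depth and sample rate can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names are hypothetical, the quantizer is a simple uniform rounding scheme, and the 44,100 Hz stereo figures are the usual CD parameters supplied here as an assumption rather than taken from the text above.

```python
import math


def quantization_levels(bytes_per_sample):
    """Number of distinct values a sample of the given size can take."""
    return 2 ** (8 * bytes_per_sample)


def quantize(value, bytes_per_sample):
    """Round a value in [-1.0, 1.0] to the nearest representable level."""
    half = quantization_levels(bytes_per_sample) // 2  # 128 for 1 byte, 32768 for 2
    code = max(-half, min(half - 1, round(value * half)))
    return code / half


def max_quantization_error(bytes_per_sample, n=1000):
    """Largest rounding error over one cycle of a sine wave of amplitude 0.9."""
    wave = (0.9 * math.sin(2 * math.pi * i / n) for i in range(n))
    return max(abs(x - quantize(x, bytes_per_sample)) for x in wave)


def pcm_bytes_per_second(sample_rate, bytes_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate * bytes_per_sample * channels


error_8bit = max_quantization_error(1)
error_16bit = max_quantization_error(2)
# Assumed CD parameters: 44,100 samples per second, 2 bytes per sample, 2 channels.
cd_rate = pcm_bytes_per_second(44100, 2, 2)
```

Comparing `error_8bit` with `error_16bit` shows the gap between adjacent levels shrinking as more bytes are used per sample, which is the fidelity gain described above.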

556 questions
2
votes
0 answers

What's the best approach to editing/process audio in React Native?

My goal is to be able to add multiple tracks from various sources (record live audio, get existing MP3 files or external sources) and be able to overlay them and edit each MP3 file individually, such as its pitch/frequency, etc. My main problem at the…
2
votes
1 answer

Is my output of librosa MFCC correct? I think I get the wrong number of frames when using librosa MFCC

result=librosa.feature.mfcc(signal, 16000, n_mfcc=13, n_fft=2048, hop_length=400) result.shape() The signal is 1 second long with sampling rate of 16000, I compute 13 MFCC with 400 hop length. The output dimensions are (13,41). Why do I get 41…
Rasula
  • 47
  • 1
  • 5
2
votes
0 answers

How to calculate peaks of the frequencies in audio frequency spectrum with python

I have an audio file consisting of multiple frequencies, and I need to find all the frequency peaks in the frequency spectrum after doing an FFT. But the issue is how to set the threshold line for the peaks. As…
astrick
  • 190
  • 1
  • 9
2
votes
1 answer

Mixing two 16-bit encoded stereo PCM samples causing noise and distortion in the resulting audio

I get two different audio samples from two sources. For microphone sound: audioRecord = new AudioRecord(MediaRecorder.AudioSource.DEFAULT, 44100, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT, …
2
votes
3 answers

PyAudio Recording and Playing Back in Real Time

I am trying to record audio from the microphone and then play that audio through the speakers. Eventually I want to modify the audio before playing it back, but I'm having trouble taking the input data and successfully playing it back through the…
Andrew Pulver
  • 178
  • 12
2
votes
1 answer

What is the expected effect of using IAudioClient2::SetClientProperties on a capture client in Windows 10?

The specification of IAudioClient2::SetClientProperties contains only one parameter, but it is not clear to me what to expect from the API given the existing documentation. The parameter is given by: typedef struct AudioClientProperties { UINT32 …
2
votes
1 answer

How to split midi file based on notes pitch using python?

How can I split midi files based on their pitch using python? I tried the music21 library, but I don't really know how this works...
SushiWaUmai
  • 348
  • 6
  • 20
2
votes
1 answer

Manipulating audio buffers in real time - Python 3.7

Using "Sound device" library I built a python 3.7 program that receives a buffer (512 samples) from an audio device, processes it and sends out to the same device. The "audio device" is an audio card, so one can connect a microphone, process the…
2
votes
1 answer

How to resolve an invalid shape for monophonic audio

I've loaded a model for testing on Jupyter Notebook and created a path for the wav audio file I plan on testing: anger = "C:\\Desktop\\Emotion Speech Recognition\\D_10\\10ANG_XX.wav" However after I've extracted the features from another Python…
2
votes
0 answers

WAV file written with AudioSegment.export() sounds half the speed as when rewriting the file with Soundfile.write

I am currently processing some audio data. I have an audio file that I have created from splitting a larger file on silence using pydub. However, if I take this audio file after exporting it with pydub, and then convert the AudioSegment's array to…
Coldchain9
  • 1,373
  • 11
  • 31
2
votes
1 answer

Octave 'wavread' undefined

I have GNU Octave 5.2.0, and I want to use it to analyze the IQ data in a wav file. This link describes a function called wavread which I can use in Octave, but when I run y = wavread(filename), I get this error message: error: 'wavread' undefined…
Daniel C Jacobs
  • 691
  • 8
  • 18
2
votes
0 answers

C# Convolution algorithm is producing a very loud .wav file

I am trying to create a convolution reverb algorithm that takes a sound input signal and convolves it in the frequency domain with an impulse response. I have been trying to debug the code for a week and I cannot seem to find where the error is. The…
M.Dyrholm
  • 21
  • 1
2
votes
0 answers

Unable to load an audio file recorded using webRTC in django

I am currently working on a Django project in which I need to record the user's audio and send it to the server for processing. I have successfully recorded the audio using WebRTC and passed it to the views using an AJAX request. But when I try to load…
2
votes
2 answers

iPhone AudioQueue - Reading incoming audio data to determine BPM

I'm trying to determine Beats Per Minute (BPM) from the microphone using sound energy. I think I've figured out the part determining BPM, but I'm having a little trouble obtaining the raw data. The example is based on Apple's SpeakHere app - on the…
Josh
  • 83
  • 6
2
votes
3 answers

Merging Multiple mp3 audio files with python

How do I merge multiple MP3 (not WAV) files with Python's pyaudio? I found a few ways to merge MP3 files with other applications or languages. For instance, I found the mp3wrap application. And this in the Go language. But how do I do this in Python? When…
munir.aygun
  • 414
  • 1
  • 4
  • 11