Questions tagged [audio-processing]

Audio processing is the study of mathematical and signal processing techniques used to understand or alter audio signals. The kinds of audio signals studied include speech, music, environmental audio and computer audio. Audio is analyzed in the time (temporal) domain or the frequency (spectral) domain, often by applying filters.

A key concept is to convert the audio to PCM format, which gives you access to the raw audio curve. Each channel has its own curve.

Digital audio is represented by a series of points on this curve. Each point is called an audio sample. The numerical value of each sample can be represented as either an integer or a floating-point number.
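As a minimal sketch of the two ideas above (one curve per channel, and integer versus floating-point samples), the following assumes the PCM data is signed 16-bit, little-endian and interleaved by channel, as is common in WAV files. The function names `pcm16le_to_floats` and `split_channels` are hypothetical and chosen only for illustration.

```python
import struct
from typing import List


def pcm16le_to_floats(raw: bytes) -> List[float]:
    """Convert signed 16-bit little-endian PCM bytes to floats in [-1.0, 1.0)."""
    count = len(raw) // 2  # two bytes per sample (assumption: 16-bit PCM)
    ints = struct.unpack("<%dh" % count, raw[: count * 2])
    return [value / 32768.0 for value in ints]


def split_channels(samples: List[float], channels: int) -> List[List[float]]:
    """Separate interleaved samples into one list (one curve) per channel."""
    return [samples[c::channels] for c in range(channels)]


# Four integer samples: two stereo frames, interleaved as L, R, L, R.
raw = struct.pack("<4h", 0, 16384, -32768, 32767)
floats = pcm16le_to_floats(raw)
left, right = split_channels(floats, 2)
```

Dividing by 32768 maps the integer range -32768..32767 onto roughly -1.0..1.0, which is the range most floating-point audio code expects.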

Be aware that storing the numerical value of each audio sample in memory typically requires several bytes. One byte can store only 2^8 (256) distinct values, which results in noticeable distortion. High quality audio is typically stored using at least two bytes per audio sample. Two bytes give 2^16 (65,536) possible values for the height of the raw audio curve as the audio wobbles up and down. The more bytes used per sample, the higher the fidelity, because this reduces the gap between adjacent curve height measurements. This is called bit depth. CD quality audio uses two bytes per audio sample per channel. The other fundamental aspect of digital audio is the sample rate, which determines the number of samples per second.
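The arithmetic above can be checked with a short sketch. The helper names `quantization_levels` and `pcm_bytes_per_second` are hypothetical; the CD figures used (44,100 samples per second, two bytes per sample, two channels) are the standard CD audio parameters.

```python
def quantization_levels(bytes_per_sample: int) -> int:
    """Number of distinct curve heights a sample of this size can represent."""
    return 2 ** (8 * bytes_per_sample)


def pcm_bytes_per_second(sample_rate_hz: int, bytes_per_sample: int, channels: int) -> int:
    """Storage needed for one second of uncompressed PCM audio."""
    return sample_rate_hz * bytes_per_sample * channels


levels_one_byte = quantization_levels(1)    # 2^8
levels_two_bytes = quantization_levels(2)   # 2^16

# CD quality: 44,100 samples per second, two bytes per sample, stereo.
cd_bytes_per_second = pcm_bytes_per_second(44100, 2, 2)
```

With these numbers, one second of CD quality stereo audio occupies 176,400 bytes before any compression.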

556 questions
3
votes
1 answer

How to change the pitch of a .wav file in Android?

Can somebody tell me how to change the pitch of a wave file in Android?
Sukitha Udugamasooriya
  • 2,268
  • 1
  • 35
  • 56
3
votes
1 answer

Peak separation with deconvolution in audio signal processing

I am trying to develop an algorithm that separates instrumental notes in music files. C# and C++ DLLs are used. I've spent a pretty long time trying to achieve it. So what I've done so far is: Perform a specialized FFT on PCM (it gives high resolution both in time and…
Laie
  • 540
  • 5
  • 14
3
votes
1 answer

Drum sound recognition algorithms

I am thinking of trying to make a program that will automatically generate drum tabs using an audio file containing only the drumming. I have thought of using FFT to get the average spectrum peaks during a xxxx ms interval and then compare that to a…
Eqric
  • 55
  • 5
3
votes
1 answer

How to test sound level rms algorithm

My app calculates the noise level and peak frequency of the input sound. I used FFT to get an array of shorts[] buffer, and this is the code: bufferSize = 1024, sampleRate = 44100 int bufferSize = AudioRecord.getMinBufferSize(sapleRate, …
Fareed
  • 560
  • 2
  • 7
  • 23
3
votes
1 answer

How to compute zero crossing rate of signal?

I'd like to get the zero crossing rate of an audio signal. I tried to write the code for this formula: But I don't exactly understand the whole formula. To process my code I split the signal into blocks, I mean "frame blocking". For example each length of blocks…
Cengaver
  • 87
  • 2
  • 9
3
votes
3 answers

Is it possible to access the waveform of a song from a Spotify app?

I am thinking about how to build a Spotify app that does beat detection (extracting the BPM of a song). For that I need to access the raw audio, the waveform, and analyze it. I am new to building Spotify apps. I know that with "libspotify" you can access raw…
wizbcn
  • 1,064
  • 1
  • 12
  • 19
3
votes
3 answers

Timing in C# real time audio analysis

I'm trying to determine the "beats per minute" from real-time audio in C#. It is not music that I'm detecting, though, just a constant tapping sound. My problem is determining the time between those taps so I can determine "taps per minute" I…
zac
  • 31
  • 1
  • 3
3
votes
1 answer

Breaking a video into frames with python

I am trying to write a program that deletes frames of a video that don't have a particular symbol in them. My general plan: (1) Split the audio from the video. (2) Split the video into frames. (3) Run the frames through a subroutine that looks for the symbol, by…
2
votes
2 answers

How to best determine volume of a signal?

I want to determine the volume of an audio signal. I have found two options: (1) compute the root mean square of the amplitude; (2) find the maximum amplitude. Are there advantages to using #1 or #2? Here is what I am trying to do: I want my Android to…
gregm
  • 12,019
  • 7
  • 56
  • 78
2
votes
2 answers

How to detect the voice from an audio stream

I need to determine when someone speaks in an audio stream. I applied the Hamming window and calculated the FFT. How do I detect the human voice from here?
user1019710
  • 321
  • 5
  • 14
2
votes
1 answer

"ValueError: x and y must have same first dimension" when trying to plot the signal amplitude of a wav file using Python

Using Python, I am trying to plot the signal amplitude of a wav file, however I am getting the following error "ValueError: x and y must have same first dimension". Here is my code: import wave import matplotlib.pyplot as plt import numpy as…
Jay
  • 21
  • 3
2
votes
1 answer

How to turn a numpy array (mic/loopback input) into a torchaudio waveform for a PyTorch classifier

I am currently working on training a classifier with PyTorch and torchaudio. For this purpose I followed the following tutorial: https://towardsdatascience.com/audio-deep-learning-made-simple-sound-classification-step-by-step-cebc936bbe5 This all…
Jalau
  • 303
  • 1
  • 2
  • 11
2
votes
0 answers

Removing Polyphony from Audio

Here's an interesting question: Suppose I have an audio recording of a C major chord (C-E-G) played on the piano, which I would like to separate into three separate audio files - one with only C, one with only E, and one with only G playing (even a…
2
votes
1 answer

Extracting Instrument Qualities From Audio Signal

I'm looking to write a function that takes an audio signal (assuming it contains a single instrument playing), out of which I would like to extract the instrument-like features out of the audio and into a vector space. So in theory, if I had two…
2
votes
0 answers

Set AudioWorkletProcessor input to Float32Array

I have been trying to extract input from one AudioWorkletProcessor using postMessage, then insert that input into another AudioWorkletProcessor. I managed to get the Float32Array into the second AudioWorkletProcessor process method but RingBuffer…