Questions tagged [audio-processing]

Audio processing is the study of mathematical and signal processing techniques used to understand or alter audio signals. The kinds of audio signals studied include speech, music, environmental audio and computer audio. Audio is analyzed in the temporal or spectral domain by applying various filters.

A key concept is to convert the audio into PCM format so that you have access to the raw audio curve (the waveform). Each channel has its own curve.
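As a minimal sketch of what "each channel has its own curve" means in practice, the following uses only Python's standard `wave` and `struct` modules to split interleaved 16-bit PCM into one list of samples per channel. The helper names `write_stereo_wav` and `read_channels` are hypothetical, and the sketch assumes 16-bit little-endian PCM; other sample widths would need a different unpack format.

```python
import io
import struct
import wave


def write_stereo_wav(left, right, sample_rate=44100):
    """Hypothetical helper: pack two int16 sample lists into an in-memory WAV."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        # PCM frames are interleaved: L0, R0, L1, R1, ...
        interleaved = [s for pair in zip(left, right) for s in pair]
        w.writeframes(struct.pack("<%dh" % len(interleaved), *interleaved))
    buf.seek(0)
    return buf


def read_channels(wav_file):
    """Hypothetical helper: return one list of int16 samples per channel."""
    with wave.open(wav_file, "rb") as w:
        n_channels = w.getnchannels()
        if w.getsampwidth() != 2:
            raise ValueError("this sketch assumes 16-bit PCM")
        raw = w.readframes(w.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # De-interleave: every n_channels-th sample belongs to the same channel.
    return [list(samples[c::n_channels]) for c in range(n_channels)]


channels = read_channels(write_stereo_wav([0, 100, 200], [0, -100, -200]))
```

Here `channels[0]` holds the left curve and `channels[1]` the right one. Compressed formats such as MP3 would first have to be decoded to PCM by a separate decoder, which this sketch does not cover.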

Digital audio is represented by a series of points on this curve. Each point is called an audio sample. The numerical value of each sample can be stored as either an integer or a floating-point number.
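The two representations can be converted into each other. The sketch below assumes signed 16-bit integers and a floating-point range of -1.0 to 1.0; the function names are hypothetical, and the scale factors (32768 in one direction, 32767 in the other) are one common convention rather than the only one in use.

```python
def int16_to_float(samples):
    """Map signed 16-bit integer samples to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def float_to_int16(samples):
    """Map floats to signed 16-bit integers, clipping out-of-range input."""
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip so the result fits in 16 bits
        out.append(int(round(x * 32767)))
    return out


as_float = int16_to_float([-32768, 0, 16384])
as_int = float_to_int16([1.0, -1.0, 0.0, 2.0])
```

Floating point is convenient for processing because intermediate results can exceed the nominal range without wrapping around; clipping is applied only when converting back to integers.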

Be aware that storing each sample's numerical value in memory typically requires several bytes. One byte can store only 2^8 (256) distinct values, which results in noticeable distortion. High-quality audio is typically stored using at least two bytes per sample. Two bytes give 2^16 (65,536) possible values for the height of the raw audio curve as the audio wobbles up and down. The more bytes used per sample, the higher the fidelity, because the gap between adjacent measurable curve heights shrinks. This is called bit depth. CD-quality audio uses two bytes per sample per channel. The other fundamental aspect of digital audio is the sample rate, which determines the number of samples per second.
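The arithmetic behind bit depth and sample rate can be sketched in a few lines. The function names are hypothetical; the CD figures used in the example (44,100 samples per second, two bytes per sample, two channels) are the standard values for CD audio.

```python
def quantization_levels(bytes_per_sample):
    """Number of distinct curve heights a sample of this size can encode."""
    return 2 ** (8 * bytes_per_sample)


def pcm_bytes_per_second(sample_rate, bytes_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate * bytes_per_sample * channels


levels_8_bit = quantization_levels(1)
levels_16_bit = quantization_levels(2)
cd_bytes_per_second = pcm_bytes_per_second(44100, 2, 2)
```

With these inputs, one byte gives 256 levels, two bytes give 65,536 levels, and CD-quality stereo comes to 176,400 bytes per second of uncompressed audio.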

556 questions
6
votes
1 answer

Verizon SongID - How is it programmed?

For anyone not familiar with Verizon's SongID program, it is a free application downloadable through Verizon's VCast network. It listens to a song for 10 seconds at any point during the song and then sends this data to some all-knowing algorithmic…
CheeseConQueso
  • 5,831
  • 29
  • 93
  • 126
6
votes
1 answer

What kind of sound processing algorithm allows you to make visualizations like this?

I'm interested in making an OpenGL visualizer for MP3s as a pet project. I stumbled upon this YouTube video which demonstrates someone showing off a visualizer being used in conjunction with Augmented…
Gaius
  • 349
  • 1
  • 4
  • 5
6
votes
6 answers

Extracting a specific melody/beat/rhythm from a specific instrument from a mixed wave (or other music format) file

Is it possible to write a program that can extract a melody/beat/rhythm provided by a specific instrument in a wave (or other music format) file made up of multiple instruments? Which algorithms could be used for this and what programming language…
Shane
  • 61
  • 1
  • 2
6
votes
1 answer

Aubio for BPM tracking on Android

I am working on an Android audio project which requires BPM tracking. I decided that writing my own would not be a good idea and, after looking around, I found a few libraries that do BPM tracking, such as aubio, vamp, echonest etc. Out of the lot…
Gan
  • 1,349
  • 2
  • 10
  • 27
5
votes
1 answer

Analyzing audio to create Guitar Hero levels automatically

I'm trying to create a Guitar-Hero-like game (something like this) and I want to be able to analyze an audio file given by the user and create levels automatically, but I am not sure how to do that. I thought maybe I should use BPM detection…
Symbol
  • 145
  • 3
  • 10
5
votes
2 answers

Help with implementing this beat-detection algorithm?

I recently tried to implement a beat detection code found here, namely the Derivation and Combfilter algorithm #1: http://archive.gamedev.net/reference/programming/features/beatdetection/page2.asp I'm not too sure if I implemented it successfully as…
dspboy
  • 91
  • 2
5
votes
1 answer

Does Julia have support for audio processing

I want to play around with audio at low level. I want functionality such as reading mp3 files and creating audio files (with both channels independently controllable). The ability to listen to generated audio in the code notebook (I am using Pluto)…
5
votes
1 answer

Understanding the shape of spectrograms and n_mels

I am going through these two librosa docs: melspectrogram and stft. I am working on datasets of audio of variable lengths, but I don't quite get the shapes. For example: (waveform, sample_rate) = librosa.load('audio_file') spectrogram =…
swe87
  • 129
  • 1
  • 3
  • 13
5
votes
1 answer

FFT of data received from PyAudio gives wrong frequency

My main task is to recognize a human humming from a microphone in real time. As the first step to recognizing signals in general, I have made a 5 seconds recording of a 440 Hz signal generated from an app on my phone and tried to detect the same…
5
votes
2 answers

Android Microphone To Pick Up A Specific Tone

Hello, I was wondering whether, using the Android tone generator class, it would be possible to create a tone on one device and listen for this same tone on another device. If this is possible I do have a few other questions. Taking background noise into…
Keith
  • 420
  • 2
  • 5
  • 12
5
votes
3 answers

Scipy io read wavfile error

Whenever I try to read a .wav file, the following error comes. I have searched everywhere but had no progress upon this. CODE: import scipy as sp import matplotlib.pyplot as plt sr, y = sp.io.wavfile.read(MY_FILENAME) print sr ERROR: File…
user5722540
  • 590
  • 8
  • 24
5
votes
1 answer

AVAudioRecorder in Swift 3: Get Byte stream instead of saving to file

I am new to iOS programming and I want to port an Android app to iOS using Swift 3. The core functionality of the app is to read the byte stream from the microphone and to process this stream live. So it is not sufficient to store the audio stream…
Simon Hessner
  • 1,757
  • 1
  • 22
  • 49
5
votes
1 answer

Chord Detection Algorithm with the Web Audio API

First off I'm trying to implement this chord detection algorithm: http://www.music.mcgill.ca/~jason/mumt621/papers5/fujishima_1999.pdf I originally implemented the algorithm to use my microphone, but it didn't work. As a test I created three…
5
votes
1 answer

Vector Quantization in Speech Processing Explanation

I'm having trouble determining from this research paper exactly how I can reproduce the Standard Vector Quantization algorithm to determine the language of an unidentified speech input, based on a training set of data. Here's some basic…
atp
  • 30,132
  • 47
  • 125
  • 187
5
votes
0 answers

Algorithm for real-time convolution reverb using long impulse responses?

I am attempting to program an audio application in C# and need to implement a real-time convolution reverb processor. The method I am currently using is breaking down when using impulse responses of length above ~16,000 samples at 44.1kHz. I need to…