Questions tagged [audio-processing]

Audio processing involves the study of mathematical and signal processing techniques to understand or alter the nature of audio signals. The kinds of audio signals under study include speech, music, environmental audio, and computer audio. Audio is analyzed in the temporal or spectral domain by applying various filters.

The key concept is to transform the audio into PCM format so that you have access to the raw audio curve. Each channel has its own curve.
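As a rough illustration of "each channel has its own curve", here is a minimal sketch that splits interleaved PCM bytes into one list of samples per channel. It assumes 16-bit signed little-endian samples with channels interleaved frame by frame (a common layout, for example in WAV files); `split_channels` is a hypothetical helper name, not part of any library.

```python
import struct

def split_channels(pcm_bytes, num_channels):
    """De-interleave 16-bit signed little-endian PCM into one list per channel.

    Assumption: samples are interleaved frame by frame (L, R, L, R, ...).
    """
    count = len(pcm_bytes) // 2  # two bytes per 16-bit sample
    samples = struct.unpack(f"<{count}h", pcm_bytes[: count * 2])
    # Every num_channels-th sample, starting at offset ch, belongs to channel ch.
    return [list(samples[ch::num_channels]) for ch in range(num_channels)]

# Two stereo frames: (left=1, right=-1) and (left=2, right=-2).
raw = struct.pack("<4h", 1, -1, 2, -2)
left, right = split_channels(raw, 2)
print(left, right)
```

Other sample widths or a planar (non-interleaved) layout would need a different unpacking step.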

Digital audio is represented by a series of points on this curve. Each point is called an audio sample. The numerical value of each sample can be represented as either an integer or a floating-point number.
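To make the integer versus floating-point distinction concrete, the sketch below maps 16-bit integer samples onto floats in the range [-1.0, 1.0). Dividing by 32768 is one common convention, not the only one; `int16_to_float` is a hypothetical helper name used only for illustration.

```python
def int16_to_float(sample):
    """Map a signed 16-bit integer sample (-32768..32767) to a float in [-1.0, 1.0).

    Assumption: scaling by 32768 (2**15); some code scales by 32767 instead.
    """
    if not -32768 <= sample <= 32767:
        raise ValueError("sample is outside the signed 16-bit range")
    return sample / 32768.0

int_samples = [-32768, 0, 16384]
float_samples = [int16_to_float(s) for s in int_samples]
print(float_samples)
```

The float form is usually more convenient for filtering or Fourier transforms, while the integer form is what is typically stored on disk.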

Be aware that storing each sample's numerical value typically requires several bytes. One byte can represent only 2^8 (256) distinct values, which results in noticeable distortion. High-quality audio is typically stored using at least two bytes per sample, which gives 2^16 (65,536) possible values for the height of the raw audio curve as the audio wobbles up and down. The more bytes used per sample, the higher the fidelity, because the gap between adjacent measurable curve heights shrinks. This is called bit depth. CD-quality audio uses two bytes per audio sample per channel. The other fundamental aspect of digital audio is the sample rate, which determines the number of samples per second.
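The arithmetic behind bit depth and sample rate can be sketched as follows. The function names are hypothetical, and the CD figures assume the standard 44,100 samples per second with two channels.

```python
def quantization_levels(bytes_per_sample):
    """Number of distinct curve heights representable with the given sample width."""
    return 2 ** (8 * bytes_per_sample)

def pcm_bytes_per_second(sample_rate, bytes_per_sample, channels):
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate * bytes_per_sample * channels

print(quantization_levels(1))  # one byte per sample
print(quantization_levels(2))  # two bytes per sample
# CD audio, assuming 44,100 samples/s, 2 bytes per sample, 2 channels.
cd_rate = pcm_bytes_per_second(44100, 2, 2)
print(cd_rate)
```

Each extra byte of sample width multiplies the number of representable levels by 256, while the data rate grows only linearly.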

556 questions
8
votes
1 answer

Using Mutagen to process all accepted file types

What do I need to do in order to process every file type accepted by mutagen: .ogg, .apev2, .wma, flac, mp4, and asf? (I excluded mp3 because it has the most documentation on it.) I'd appreciate it if someone who knows how this is done could provide…
lzc
  • 1,645
  • 5
  • 27
  • 41
8
votes
1 answer

Android Audio effect on wav file and save it

Requirement: on Android, open a .wav file on the SD card, play it, add some effect (like echo, pitch shift, etc.), and save the file with the effect. Simple :( What I know: I can open and play the file using SoundPool or MediaPlayer. I can give some effect while playing…
7
votes
1 answer

How to train a machine learning algorithm using MFCC coefficient vectors?

For my final year project I am trying to identify dog/bark/bird sounds in real time (by recording sound clips). I am using MFCC as the audio features. Initially I have extracted altogether 12 MFCC vectors from a sound clip using the jAudio library. Now I'm…
7
votes
1 answer

AVFoundation audio processing using AVPlayer's MTAudioProcessingTap with remote URLs

There is precious little documentation on AVAudioMix and MTAudioProcessingTap, which allow processing to be applied to the audio tracks (PCM access) of media assets in AVFoundation (on iOS). This article and a brief mention in a WWDC 2012 session…
jbat100
  • 16,757
  • 4
  • 45
  • 70
7
votes
2 answers

Transforming Audio Samples From Time Domain to Frequency Domain

As a software engineer, I am facing some difficulties while working on a signal processing problem. I don't have much experience in this area. What I am trying to do is sample the environmental sound with a 44100 sampling rate and for fixed size…
vaha
  • 299
  • 5
  • 12
7
votes
4 answers

AVAudioPlayer rate

So I'm trying to play a sound file at a different rate in iOS 5.1.1, and am having absolutely no luck. So far I have tried setting the rate of the AVAudioPlayer: player = [[AVAudioPlayer alloc] initWithContentsOfURL:referenceURL…
user293895
  • 1,465
  • 3
  • 22
  • 39
7
votes
4 answers

Python NumPy - FFT and Inverse FFT?

I've been working with FFT, and I'm currently trying to get a sound waveform from a file with FFT, (modify it eventually), but then output that modified waveform back to a file. I've gotten the FFT of the soundwave and then used an inverse FFT…
SolarLune
  • 1,229
  • 3
  • 12
  • 14
6
votes
1 answer

Python find audio frequency and amplitude over time

Here is what I would like to do. I would like to find the audio frequency and amplitude of a .wav file at every say 1ms of that .wav file and save it into a file. I have graphed frequency vs amplitude and have graphed amplitude over time but I…
Taylor
  • 61
  • 1
  • 2
6
votes
1 answer

why my 8kHz wav file's mel feature extracted differently in sr = 16kHz and 44.1kHz

I'm currently extracting mel features from my baby cry sound dataset. The wav files are 8kHz, 16-bit, mono, and about 7 sec long. Mel-spectrogram when sr = 16000 Mel-spectrogram when sr = 44100 But as you can see, whenever I extract…
valentineday
  • 69
  • 1
  • 7
6
votes
2 answers

Best way to verify an mp3 file with python

I have to detect whether a file is a valid mp3 file. So far, I have found two solutions, including: this solution from Peter Carroll using try-catch expression: try: _ = librosa.get_duration(filename=mp3_file_path) return True except: …
xtluo
  • 1,961
  • 18
  • 26
6
votes
1 answer

How can I obtain the raw audio frames from the microphone in real-time or from a saved audio file in iOS?

I am trying to extract MFCC vectors from the audio signal as input into a recurrent neural network. However, I am having trouble figuring out how to obtain the raw audio frames in Swift using Core Audio. Presumably, I have to go low-level to get…
macklinagent
  • 75
  • 1
  • 6
6
votes
0 answers

playing low frequency heartbeat signal through mobile speaker

I am making an app to listen to a heartbeat. I can listen to the filtered heartbeat signal through a headset but not through the mobile speaker, as the mobile speaker doesn't support such low frequencies. I tried frequency shifting, but it results in a…
pavan
  • 91
  • 3
6
votes
3 answers

Correct way to Convert 16bit PCM Wave data to float

I have a wave file in 16bit PCM form. I've got the raw data in a byte[] and a method for extracting samples, and I need them in float format, i.e. a float[] to do a Fourier Transform. Here's my code, does this look right? I'm working on Android so…
fredley
  • 32,953
  • 42
  • 145
  • 236
6
votes
2 answers

How to determine if an audio track is a Dolby Pro Logic II mixdown

I'm trying to find out if there's a way to determine if an AAC-encoded audio track is encoded with Dolby Pro Logic II data. Is there a way of examining the file such that you can see this information? I have for example encoded a media file in…
WheresWardy
  • 375
  • 3
  • 22
6
votes
1 answer

Modify volume gain on audio sample buffer

I want to increase the volume of a buffer containing voice data. The point is that I'm using DirectSound and I have one primary and one secondary buffer; all stream mixing is done by hand. In a voice chat all participants can have independent volume levels. I…
Dalamber
  • 1,009
  • 1
  • 12
  • 32