Questions tagged [mfcc]

Mel-Frequency Cepstral Coefficients (MFCC) are an alternative representation of a speech signal based on its frequency content. They are a very popular way to represent a speech signal as a feature vector and are used primarily for speech recognition tasks.

Mel-Frequency Cepstral Coefficients (MFCC) are the coefficients obtained when a speech signal is analysed by a bank of filters whose center frequencies are evenly spaced on the mel scale, which is roughly linear in Hz at low frequencies and logarithmic at higher ones. This choice of center frequencies is significant because it approximates the frequency resolution of the human ear. MFCCs are computed from the magnitude mel-spectrogram by log-scaling it and then applying the Discrete Cosine Transform to obtain the cepstrum.
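The pipeline described above (mel filter bank, log scaling, DCT) can be sketched for a single frame in plain Python. This is a minimal illustration rather than any library's implementation: the HTK-style mel formula, the Hamming window, the naive DFT, and the choices of 26 filters and 13 coefficients are all assumptions made for the example, and real libraries differ in these details.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes of a Hamming-windowed frame (bins 0..n/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc))
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, equally spaced in mel, converted back to Hz.
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_filters):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    spec = magnitude_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log mel energies; the small floor avoids log(0).
    log_mel = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
               for row in bank]
    # Unnormalized DCT-II of the log energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_mel))
            for c in range(n_coeffs)]


# Synthetic 440 Hz tone as a stand-in for one frame of speech.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In practice the signal is split into overlapping frames and this computation is repeated per frame, which yields one feature vector per frame.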

312 questions

1 answer

How can I obtain the raw audio frames from the microphone in real-time or from a saved audio file in iOS?

I am trying to extract MFCC vectors from the audio signal as input into a recurrent neural network. However, I am having trouble figuring out how to obtain the raw audio frames in Swift using Core Audio. Presumably, I have to go low-level to get…

asked Dec 01 '17 at 22:38

macklinagent

1 answer

Librosa: Cannot provide window function for mfcc on Windows

I'm currently experimenting with librosa to reproduce a scientific approach (deep learning) that used PRAAT to extract the MFCCs of audio files. I'm not that experienced with phonetics/acoustics and I had a lot of issues understanding PRAAT - so I…

python scipy anaconda librosa mfcc

asked Nov 12 '19 at 18:20

Keanri

2 answers

Comparing MFCC feature vectors with DTW

I'm looking for some advice on Dynamic Time Warping (DTW). I have a Python script that extracts Mel-Frequency Cepstral Coefficient (MFCC) feature vectors from .WAV files of various lengths. The feature vectors are arrays of varying lengths that…

audio speech-recognition dynamic-programming mfcc dtw

asked Jan 15 '18 at 16:38

amitchone
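For readers unfamiliar with the technique this question names, here is a minimal sketch of DTW over two sequences of feature vectors with different lengths. The function names `euclidean` and `dtw_distance` are made up for this example, and the Euclidean frame distance and unconstrained step pattern are assumptions, not something taken from the question.

```python
import math


def euclidean(a, b):
    """Distance between two frames (feature vectors of equal dimension)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic O(len_a * len_b) DTW cost between two sequences of frames."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the best alignment cost of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# b is a time-stretched copy of a, so the warped distance is zero.
a = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
b = [[0.0, 1.0], [0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [2.0, 3.0]]
print(dtw_distance(a, b))  # 0.0
```

Because the raw cost grows with sequence length, it is often divided by a length term before distances between different pairs are compared; which normalization suits a given task is a separate design choice.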

2 answers

How to get MFCC with TarsosDSP?

I searched everywhere and I couldn't figure out how to extract MFCC features using TarsosDSP on Android. I know how to get the FFT out of a file. Any help?

android mfcc tarsosdsp

asked Nov 05 '16 at 08:06

Hassan Pezeshk

4 answers

Librosa mel filter bank decreasing triangles

I'm a bit stuck understanding MFCCs. From what I have read the mel filter banks should be a series of triangles that get wider and their peaks are at the same place. Like this... However when I compute the mel filter banks using librosa I…

matplotlib mfcc librosa

asked Oct 22 '16 at 21:07

Jack Deadman
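One possible reason for triangles whose peaks shrink as they widen is area normalization, where each filter is scaled so that it has unit area. The truncated question does not confirm that this is what was observed, so the sketch below only illustrates the effect. The helper `triangle_peaks` and the edge frequencies are invented for the example.

```python
def triangle_peaks(edges_hz, area_normalize):
    """Peak height of each triangular filter built from consecutive edge triples."""
    peaks = []
    for lo, hi in zip(edges_hz[:-2], edges_hz[2:]):
        # An unnormalized triangle peaks at 1.0. Scaling by 2 / bandwidth gives
        # every triangle unit area, so wider filters end up with lower peaks.
        peaks.append(2.0 / (hi - lo) if area_normalize else 1.0)
    return peaks


# Hypothetical band edges that widen with frequency, as mel spacing does.
edges = [0.0, 100.0, 250.0, 500.0, 900.0]
print(triangle_peaks(edges, area_normalize=False))
print(triangle_peaks(edges, area_normalize=True))
```

With normalization off, every peak is 1.0; with it on, the peaks fall as the bands widen while each triangle keeps the same area.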

2 answers

How to compare two MFCC feature vectors, or measure the similarity between the MFCC feature vectors of two speech utterances

I have extracted 13 MFCC features from each of two utterances. The feature set for the first utterance is of size 11*13 and the other is 18*13. How can I compare the two feature sets to find the similarity between these two words? I am not using any classifier, if someone…

speech-recognition text-to-speech mfcc

asked Sep 20 '14 at 14:16

Pravin Ramteke

1 answer

MATLAB code for calculating MFCC

I have a question, if that's OK. I was recently looking for an algorithm to calculate MFCCs. I found a good tutorial rather than code, so I tried to code it myself. I still feel like I am missing one thing. In the code below I took the FFT of a signal,…

matlab signal-processing speech-recognition mfcc

asked Nov 19 '13 at 13:41

Celdor

1 answer

How to train HMM with audio sentences dataset for speech recognition?

I have read some journals and papers on HMM and MFCC, but I am still confused about how it works step by step with my dataset (an audio dataset of sentences). My data set Example (Audio Form) : hello good morning good luck for you exam etc about 343 audio…

python tensorflow speech-recognition mfcc hmmlearn

asked Jul 04 '18 at 03:22

MarcellSinaga

1 answer

How to combine mfcc vector with labels from annotation to pass to a neural network

Using librosa, I created mfcc for my audio file as follows: import librosa y, sr = librosa.load('myfile.wav') print y print sr mfcc=librosa.feature.mfcc(y=y, sr=sr) I also have a text file that contains manual annotations[start, stop, tag]…

python neural-network keras mfcc librosa

asked Jan 22 '18 at 19:07

DJ_Stuffy_K

1 answer

Python implementation of MFCC algorithm

I have a database which contains video streams. I want to calculate the LBP features from the images and the MFCC from the audio, and for every frame in the video I have some annotation. The annotation is inlined with the video frames and the time of the video.…

python python-2.7 signals mfcc

asked Nov 27 '17 at 14:04

konstantin

5 answers

Is it possible to get exactly the same results from tensorflow mfcc and librosa mfcc?

I'm trying to make the TensorFlow MFCC give me the same results as the Python librosa MFCC. I have tried to match all the default parameters used by librosa in my TensorFlow code, but I got a different result. This is the TensorFlow code that I have…

audio tensorflow mfcc librosa

asked Nov 01 '17 at 13:49

Eli Leszczynski

1 answer

How does mfcc feature size affect a recurrent neural network

I'm learning machine learning and want to know how MFCC feature size affects an RNN (recurrent neural network). With librosa I extracted the MFCCs and then the delta coefficients, and after that I get an array of dimension [13, sound_length]. The code of…

python machine-learning recurrent-neural-network mfcc librosa

asked Jan 10 '17 at 03:31

Nikas Žalias

1 answer

Python audio signal classification MFCC features neural network

I am trying to classify audio signals from speech to emotions. For this purpose I am extracting MFCC features of the audio signal and feeding them into a simple neural network (FeedForwardNetwork trained with BackpropTrainer from PyBrain).…

python audio neural-network classification mfcc

asked Aug 31 '15 at 05:22

cowhi

2 answers

How to get mfcc features with Octave

My goal is to create a program in Octave that loads an audio file (wav, flac), calculates its MFCC features and returns them as output. The problem is that I do not have much experience with Octave and cannot get Octave to load the audio file, and that is why…

signal-processing octave speech-recognition mfcc

asked May 31 '15 at 16:47

nstanchev

1 answer

How to Extract MFCC features in Java

I am working on converting a speech recognition project from MATLAB to Java code. I have been able to read the .wav files (as vectors of values in the range -1 to 1) using the java example provided here. This works exactly as the wavread function in…

java matlab audio feature-extraction mfcc

asked May 15 '14 at 05:01

Tryxo
