Questions tagged [mfcc]

Mel-Frequency Cepstral Coefficients (MFCCs): a compact representation of a speech signal derived from its frequency content. MFCCs are a very popular way to represent a speech signal as a feature vector and are used primarily for speech recognition tasks.

Mel-Frequency Cepstral Coefficients (MFCCs) are obtained by analysing a speech signal with a bank of filters whose center frequencies are equally spaced on the mel scale, which is roughly linear in Hz below about 1 kHz and logarithmic above. This spacing matters because it approximates the frequency resolution of the human ear. MFCCs are computed from the magnitude mel-spectrogram by taking the logarithm and then applying the Discrete Cosine Transform to obtain the cepstrum. MFCCs are very popular for speech recognition tasks.
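
The steps in this description can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: it assumes the common 2595 * log10(1 + f / 700) mel formula, an unnormalized DCT-II, and a made-up vector of mel-band energies for a single frame. The function names, the 23-filter bank, and the 0 to 8000 Hz range are hypothetical choices for the example.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (in Hz) of n_filters bands equally spaced in mel."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def dct_ii(values):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def mfcc_from_mel_energies(mel_energies, n_mfcc, floor=1e-10):
    """Log-scale the mel-band energies of one frame, then keep the first DCT terms."""
    log_energies = [math.log(max(e, floor)) for e in mel_energies]
    return dct_ii(log_energies)[:n_mfcc]


# Hypothetical filter bank: 23 bands between 0 Hz and 8000 Hz.
centers = mel_center_frequencies(n_filters=23, f_min_hz=0.0, f_max_hz=8000.0)

# Hypothetical flat mel spectrum for one frame (all band energies equal to 1).
frame_energies = [1.0] * 23
coeffs = mfcc_from_mel_energies(frame_energies, n_mfcc=13)
```

The center frequencies come out evenly spaced in mel, so the gaps between them grow as the frequency in Hz increases. In practice a library such as librosa also handles framing, windowing, the FFT, and the triangular filter weights, which this sketch leaves out.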

312 questions

votes

1 answer

Spectrograms generated using Librosa don't look consistent with Kaldi?

I generated a spectrogram of a "seven" utterance using the "egs/tidigits" code from Kaldi, using 23 bins, a 20 kHz sampling rate, a 25 ms window, and a 10 ms shift. The spectrogram appears as below, visualized via the MATLAB imagesc function: I am experimenting with…

asked Apr 05 '17 at 21:05

kashkar

votes

0 answers

Meaning of MFCC

I have a conceptual problem. I know what a mel scale is and what it represents, and I know that this kind of spectrogram still has too much information for what I need. I think that if we want to reduce the amount of information in the spectrogram we use…

audio mfcc

asked Nov 26 '15 at 23:15

Anthos89

votes

1 answer

Train speech HMM from MFCC with Matlab hmmtrain

I have read many articles on this but I just do not understand how I should proceed. I'm trying to build a basic speech recognition system that feeds MFCC features to an HMM. I'm using the data available here. I'm using Matlab to do this. So far I…

matlab signal-processing speech-recognition hidden-markov-models mfcc

asked Jan 27 '15 at 09:42

Josyula Krishna

votes

2 answers

Fastest method of MFCC extraction on linux machine

What is the fastest way of extracting MFCCs from audio files on Linux (Raspberry Pi in my case)? I tried sphinx3 but it was slow for large files (on Raspberry Pi). SFS (Speech Filing System) was quite fast on Windows but I could not install it on…

signal-processing speech-recognition raspberry-pi mfcc

asked Dec 19 '13 at 09:16

Ironclad

votes

3 answers

Use libxtract or other small C, C++ library for VAD functionality

I am trying to create a speaker identification system on Android. Currently I'm using libxtract to calculate the MFCC vector from frames and libsvm for classification. Do you have any idea how to use libxtract or another small C or C++ library that I can compile under…

voice mfcc

asked Sep 11 '13 at 09:24

Jack

votes

1 answer

Is there any MFCC library can be used in android?

My team is making an app for emotion recognition in speech. To get MFCCs, we use the comirva package. The problem is that the AudioInputStream needed to create an AudioPreProcessor can't be used on Android. So we have been looking for some kind of alternative. Is there…

android speech-recognition mfcc

asked Sep 04 '12 at 05:37

joejo

votes

0 answers

Why does applying the hamming window to framed data show a consistent difference in behavior between python and C?

This is the code I wrote in Python that extracts data from a .wav file, applies pre-emphasis, divides it into frames of 0.025 s with a 0.010 s stride, and applies a Hamming window: import scipy.io.wavfile as wavfile import numpy as np samplerate, data =…

python c signal-processing wav mfcc

asked Apr 21 '23 at 05:48

FloopyBeep

votes

0 answers

Mel-spectrogram vs MFCC for Automatic Speech Recognition

I am trying to do Automatic Speech Recognition using a CNN. For the feature extraction I am using MFCC. I have read many articles; some of them say that with a lot of data and classifiers like CNNs, mel spectrograms are better, while others say MFCC is…

conv-neural-network speech-recognition spectrogram mfcc

asked Aug 24 '22 at 22:41

A. Gehani

votes

1 answer

Is my output of librosa MFCC correct? I think I get the wrong number of frames when using librosa MFCC

result=librosa.feature.mfcc(signal, 16000, n_mfcc=13, n_fft=2048, hop_length=400) result.shape() The signal is 1 second long with sampling rate of 16000, I compute 13 MFCC with 400 hop length. The output dimensions are (13,41). Why do I get 41…

python audio librosa audio-processing mfcc

asked Jul 01 '21 at 16:47

Rasula
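
The frame count in the excerpt above can be reproduced with simple arithmetic. The sketch below assumes librosa's default `center=True` behaviour, in which the signal is padded by `n_fft // 2` samples on each side before framing. The helper `frame_count` is a hypothetical name used only for this illustration and is not part of librosa.

```python
def frame_count(n_samples, hop_length, n_fft, center=True):
    """Number of STFT frames for a signal of n_samples samples.

    Assumes that centering pads n_fft // 2 samples on each side of the signal.
    """
    padded = n_samples + 2 * (n_fft // 2) if center else n_samples
    if padded < n_fft:
        raise ValueError("signal is shorter than one analysis window")
    return 1 + (padded - n_fft) // hop_length


# 1 second at 16 kHz, hop_length=400, n_fft=2048, as in the excerpt.
n_frames_centered = frame_count(16000, 400, 2048)
n_frames_uncentered = frame_count(16000, 400, 2048, center=False)
```

With centering, the count reduces to `1 + 16000 // 400`, which gives the 41 frames reported in the question. Without centering, the same parameters give 35 frames.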

votes

0 answers

MFCC Normalization for DL

I have a dataset containing MFCC features as input for a deep learning model. Now when I look at my mfccs they have large varying ranges of values (e.g. (-100,200),(0,5),(-1,1),...). Now I would like to normalize them for my model to be suited for…

python tensorflow keras audio mfcc

asked Jan 23 '21 at 10:33

Flitschi

votes

1 answer

What are the components of the Mel mfcc

In looking at the output of this line of code: mfccs = librosa.feature.mfcc(y=librosa_audio, sr=librosa_sample_rate, n_mfcc=40) print("MFCC Shape = ", mfccs.shape) I get a response of MFCC Shape = (40,1876). What do these two numbers represent? I…

librosa mfcc

asked Dec 08 '20 at 20:40

Joe

votes

2 answers

Relation between hop_length, win_length, frame_length, n_fft, no.of frames

I am working with mfcc features in Python via librosa: mfccs = librosa.feature.mfcc(y=y,sr=sr,n_mfcc=12,n_fft=320,hop_length=320,htk=True) Here, I took audio signal of 1s duration which gave me len(y) = 16000, hence I took sr = 16000. I calculated…

python librosa mfcc

asked Aug 08 '20 at 18:06

Pranaswi Reddy

votes

1 answer

Librosa's inverse mel spectrogram to stft taking a long time

I am currently trying to convert a mel spectrogram back into an audio file. However, librosa's mel_to_stft function is taking a long time (upwards of 15 minutes) to process a 30 second .wav file sampled at 384kHz. The following is my code: # Code…

python audio spectrogram librosa mfcc

asked Aug 07 '20 at 02:34

Sam

votes

1 answer

What is the second number in the MFCCs array?

When I extract MFCCs from an audio file, the output is (13, 22). What does the second number represent? Is it time frames? I use librosa. The code I use is: mfccs = librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=13,…

python audio librosa mfcc

asked Jul 04 '20 at 09:01

ioan_bl

votes

2 answers

What are the differences between MFCC and BFCC?

I have implemented the MFCC algorithm and want to implement BFCC. What are the differences between them, and is it enough just to use another function instead of the frequency-to-mel (2595 * Math.log10(1 + frequency / 700) ) and mel-to-frequency functions…

java algorithm signal-processing mfcc

asked Jun 02 '11 at 14:44

kamaci
