Questions tagged [mfcc]

Mel-Frequency Cepstral Coefficients (MFCCs): a representation of a speech signal based on its frequency content. MFCCs are a very popular way to represent a speech signal as a feature vector and are used primarily for speech recognition tasks.

Mel-Frequency Cepstral Coefficients (MFCCs) are obtained by analysing a speech signal with a bank of filters whose center frequencies are evenly spaced on the mel scale, which is approximately linear below about 1 kHz and logarithmic above it. This spacing is significant because it approximates the frequency resolution of the human ear. MFCCs are computed from the magnitude mel-spectrogram by taking the logarithm and then applying the Discrete Cosine Transform to obtain the cepstrum. MFCCs are very popular for speech recognition tasks.
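
As a rough illustration of the steps described above, the sketch below places filter centers evenly on the mel scale and turns per-filter energies into cepstral coefficients with a logarithm followed by a DCT-II. It is not any particular library's implementation: the HTK-style mel formula, the unnormalized DCT, and all function names are assumptions made for this example.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the HTK-style formula (an assumption)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def dct_ii(values):
    """Unnormalized DCT-II; libraries usually apply an extra scaling."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def mfcc_from_mel_energies(mel_energies, n_mfcc, eps=1e-10):
    """Log-scale the per-filter energies, then keep the first DCT terms."""
    log_energies = [math.log(e + eps) for e in mel_energies]
    return dct_ii(log_energies)[:n_mfcc]


# Hypothetical usage: 10 filters between 0 Hz and 8 kHz.
centers = mel_center_frequencies(10, 0.0, 8000.0)
coeffs = mfcc_from_mel_energies([math.e] * 8, n_mfcc=3)
```

In Hz the centers drift further apart as frequency rises, which is the ear-like spacing the tag description refers to. A flat set of filter energies puts everything into the first coefficient and leaves the higher ones near zero.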

312 questions

1 answer

How to use MFCC vectors for classifying a single audio file?

This is probably a very silly question, but I couldn't find details anywhere. I have an audio recording (wav file) that is 3 seconds long. That is my sample, and it needs to be classified as [class_A] or [class_B]. By following some tutorial on…

asked May 14 '13 at 14:56

nnyjoh

2 answers

Library to train GMMs from MFCC

I am trying to build a basic emotion detector from speech using MFCCs, their deltas and delta-deltas. A number of papers report good accuracy from training GMMs on these features. I cannot seem to find a ready-made package to do the…

r audio speech-recognition gaussian mfcc

asked Mar 15 '13 at 16:15

Dony George

1 answer

How to use MFCC features to train an SVM classifier for voice recognition?

I am currently in the discussion phase of a voice recognition project. I use MFCC feature extraction, but the MFCC feature returned from the function is a matrix, e.g. a (20,38) feature matrix for each voice file (wav). But how can I pass this…

svm voice-recognition mfcc

asked Mar 01 '13 at 13:28

user1423164
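
One common way to approach the shape problem raised in this question is to collapse each per-file matrix into a fixed-length vector before handing it to a classifier. The sketch below is only an illustration under stated assumptions: it treats rows as coefficients and columns as frames (the excerpt does not say which axis is which), uses random numbers in place of real MFCCs, and the helper name `summarize_mfcc` is hypothetical.

```python
import numpy as np


def summarize_mfcc(mfcc_matrix):
    """Collapse a (n_coeffs, n_frames) matrix into one vector.

    Assumes rows are coefficients and columns are frames. The result
    holds the per-coefficient mean followed by the per-coefficient
    standard deviation, so its length does not depend on n_frames.
    """
    return np.concatenate([mfcc_matrix.mean(axis=1), mfcc_matrix.std(axis=1)])


# Stand-in data: three "files" with 20 coefficients and varying frame counts.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(20, n_frames)) for n_frames in (38, 51, 27)]

# One row per file, 40 columns (20 means + 20 standard deviations).
X = np.stack([summarize_mfcc(m) for m in clips])
```

A matrix like `X`, together with one label per row, has the two-dimensional layout that classifiers such as scikit-learn's `SVC` expect. Averaging over frames discards temporal order, so whether this summary is adequate depends on the task.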

1 answer

Understanding MFCC output for a simple sine wave

I generate a simple sine wave with a frequency of 200 and calculate an FFT to check that the obtained frequency is correct. Then I calculate the MFCC, but I do not understand what its output means. What is the explanation of the output, and where do I see…

python signal-processing librosa mfcc

asked Jun 05 '23 at 19:18

codeDom

4 answers

Matching two series of MFCC coefficients

I have extracted two series of MFCC coefficients from two audio files of around 30 seconds each, containing the same speech content. The audio files are recorded at the same location from different sources. An estimation should be made whether the audio…

matlab audio matching similarity mfcc

asked Aug 03 '11 at 19:23

Sney

1 answer

Get timing information from MFCC generated with librosa.feature.mfcc

I am extracting MFCCs from an audio file using librosa's function (librosa.feature.mfcc), and I correctly get back a NumPy array with the shape I was expecting: 13 MFCC values for the entire length of the audio file, which is 1292 windows (in 30…

python audio librosa mfcc

asked Dec 11 '20 at 10:36

GiulioG
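
For questions like this one, the time of each MFCC column can usually be recovered from the hop length and the sample rate. The sketch below is a plain-Python illustration, not the asker's code: it assumes librosa's default `sr=22050` and `hop_length=512`, centered framing, and a clip of about 30 seconds (the excerpt is truncated, so the duration is an assumption). The function names are hypothetical.

```python
def frame_to_seconds(frame_index, sample_rate=22050, hop_length=512):
    """Start time of a frame, given the hop between consecutive frames."""
    return frame_index * hop_length / sample_rate


def frame_count(n_samples, hop_length=512):
    """Number of frames for centered framing (an assumption here)."""
    return 1 + n_samples // hop_length


# Assumed clip: 30 seconds at 22050 Hz.
n_frames = frame_count(30 * 22050)
frame_times = [frame_to_seconds(i) for i in range(n_frames)]
```

Under these assumptions the frame count matches the 1292 windows mentioned in the question, and consecutive frames are about 23 ms apart. librosa also provides `librosa.frames_to_time` for the same conversion, provided it is given the same `sr` and `hop_length` that were used for extraction.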

0 answers

sklearn.exceptions.NotFittedError: This LabelEncoder instance is not fitted yet

I'm trying to run voice recognition code from GitHub that analyzes voice. There is an example in final_results_gender_test.ipynb that illustrates the steps for both training and inference. So I copied and adjusted the inference part and…

python scikit-learn speech-recognition voice-recognition mfcc

asked Oct 30 '20 at 01:04

Tina J

1 answer

python tensorflow signal processing MFCC features

I'm testing the MFCC implementation in tensorflow.signal. According to the example (https://www.tensorflow.org/api_docs/python/tf/signal/mfccs_from_log_mel_spectrograms), it is computing all 80 MFCCs and then taking the first 13. I have…

python tensorflow audio signal-processing mfcc

asked Mar 02 '20 at 17:14

TYZ

1 answer

Standardize a 3D NumPy array that has been padded with np.nan

I have a 3D matrix with a shape like (100, 40, 170). This matrix has been padded to reach the max length of 170 by filling up with np.nan (NaN). The values in the matrix represent MFCC coefficients from audio data extracted from the UrbanSound8K…

python numpy normalize mfcc

asked Jul 24 '19 at 20:51

Eduardo G.R.
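
A possible way to handle the situation in this question is to compute the statistics with NumPy's NaN-aware reductions so that the padding is ignored. The sketch below is a minimal illustration under assumptions: it takes the axes to be (samples, coefficients, frames), standardizes each coefficient separately, and uses a small random array instead of the (100, 40, 170) data. The function name is hypothetical.

```python
import numpy as np


def standardize_nan_padded(features, eps=1e-8):
    """Zero-mean, unit-variance scaling per coefficient, ignoring NaN padding.

    Assumes features has shape (n_samples, n_coeffs, n_frames). Statistics
    are taken over samples and frames; NaN entries stay NaN in the output.
    """
    mean = np.nanmean(features, axis=(0, 2), keepdims=True)
    std = np.nanstd(features, axis=(0, 2), keepdims=True)
    return (features - mean) / (std + eps)


# Stand-in data: 4 samples, 3 coefficients, 6 frames, two of them padded.
rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(4, 3, 6))
batch[0, :, 4:] = np.nan
batch[2, :, 2:] = np.nan

scaled = standardize_nan_padded(batch)
```

The padded positions remain NaN afterwards, so they still need to be masked or replaced before the array is passed to a model that cannot handle NaN.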

1 answer

What is the warning 'Empty filters detected in mel frequency basis. ' about?

I'm trying to extract MFCC features from an audio file with 13 MFCCs with the below code: import librosa as l x, sr = l.load('/home/user/Data/Audio/Tracks/Dev/FS_P01_dev_001.wav', sr = 8000) n_fft = int(sr * 0.02) hop_length = n_fft // 2 mfccs…

python audio feature-extraction librosa mfcc

asked Jul 08 '19 at 07:06

Prithvi Allurkar

2 answers

How to get GFCC instead of MFCC in python?

Today I'm using MFCC from librosa in Python with the code below. It gives an array with dimension (40,40). import librosa sound_clip, s = librosa.load(filename.wav) mfcc=librosa.feature.mfcc(sound_clip, n_mfcc=40, n_mels=60) Is there a similar…

python audio artificial-intelligence mfcc librosa

asked May 11 '19 at 15:57

gynther

2 answers

Feature Extraction using MFCC

I want to know how to extract features from an audio signal (x.wav) using MFCC. I know the steps of audio feature extraction using MFCC. I want to know how to code it in Python using the Django framework.

python-3.x mfcc

asked Jan 12 '19 at 13:39

Senthuja

1 answer

librosa.util.exceptions.ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(1025, 5341)

I am trying to separate voice from background noise in an audio file using Python and then extract MFCC features, but I get the error "librosa.util.exceptions.ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(1025, 5341)". Here's the…

python-3.x audio speech-recognition voice-recognition mfcc

asked Aug 08 '18 at 19:03

Mashael

1 answer

TypeError: 'module' object is not callable . MFCC

I am working on a project based on speaker recognition using Python and am getting the following error while finding MFCC. Traceback (most recent call last): File "neh1.py", line 10, in <module> complexSpectrum = numpy.fft(signal) TypeError: 'module'…

python numpy speaker mfcc

asked Oct 19 '17 at 02:39

Neha
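
The traceback in this question is what Python reports when `numpy.fft`, which is a module, is called as if it were a function; the transform itself is `numpy.fft.fft`. The sketch below reproduces the error and shows the function call on a synthetic signal. The 200 Hz sine and the 8000 Hz sample rate are placeholders chosen for the example, not values from the question.

```python
import numpy as np

# Placeholder signal: one second of a 200 Hz sine sampled at 8000 Hz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 200 * t)

# numpy.fft is a module, so calling it directly raises TypeError.
try:
    np.fft(signal)
except TypeError as exc:
    error_message = str(exc)

# The FFT function lives inside that module.
complex_spectrum = np.fft.fft(signal)

# With one second of audio each bin is 1 Hz wide, so the index of the
# largest magnitude in the first half of the spectrum is the frequency in Hz.
peak_bin = int(np.argmax(np.abs(complex_spectrum[: sample_rate // 2])))
```

For real-valued audio, `np.fft.rfft` returns only the non-negative frequencies and is often the more convenient choice.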

1 answer

What are MFCC values?

So I know what MFCC is (Mel Frequency Cepstrum Coefficients). But I need to understand what each value is... Is it some sort of sound frequency value or what? Let's assume we have this kind of matrix. So each row represents the coefficients of one…

neural-network speech-recognition mfcc

asked Jun 04 '17 at 15:25

Nikas Žalias
