Questions tagged [mfcc]

Mel-Frequency Cepstral Coefficients: a representation of a speech signal based on its frequency content. MFCCs are a very popular way to represent a speech signal as a feature vector and are used primarily for speech recognition tasks.

Mel-Frequency Cepstral Coefficients (MFCCs) are obtained by analysing a speech signal with a bank of filters whose center frequencies are evenly spaced on the mel scale, which makes them roughly logarithmically spaced in Hz at higher frequencies. This spacing matters because it approximates the frequency resolution of the human ear. MFCCs are computed from the magnitude mel-spectrogram by taking the logarithm and then applying the Discrete Cosine Transform to obtain the cepstrum. MFCCs are widely used as features for speech recognition tasks.
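
The last two steps described above (log-scaling, then a DCT) can be sketched in plain Python. This is only an illustrative sketch, not a reference implementation: the function names are hypothetical, the mel conversion uses the common 2595 * log10(1 + f / 700) formula (other variants exist), the DCT is left unnormalised, and the input is assumed to be a list of mel-band energies for a single frame that has already been computed elsewhere.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using one common formula (an assumption; variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies in Hz of filters evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def dct_ii(values):
    """Unnormalised type-II Discrete Cosine Transform."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def mfcc_from_mel_energies(mel_energies, n_coeffs, floor=1e-10):
    """Log-scale the mel-band energies of one frame, then keep the first DCT terms."""
    # The floor avoids log(0) for silent bands.
    log_energies = [math.log(max(e, floor)) for e in mel_energies]
    return dct_ii(log_energies)[:n_coeffs]


if __name__ == "__main__":
    # Even spacing in mel gives center frequencies that spread out in Hz.
    print([round(f) for f in mel_center_frequencies(10, 0.0, 8000.0)])
    # Made-up band energies for a single frame, for illustration only.
    print(mfcc_from_mel_energies([4.0, 3.0, 2.5, 1.0, 0.5, 0.2, 0.1, 0.05], 4))
```

A real pipeline would first frame and window the signal, take the magnitude spectrum of each frame, and apply triangular filters centered at these frequencies to get the mel-band energies; libraries such as librosa handle those steps.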

312 questions
2
votes
1 answer

How to use MFCCs in Weka for audio classification?

I am trying to develop a method to classify audio using MFCCs in Weka. The MFCCs I have are generated with a buffer size of 1024, so there is a series of MFCC coefficients for each audio recording. I want to convert these coefficients into the ARFF…
CCCodes
  • 145
  • 3
  • 14
2
votes
1 answer

HTK - What do MFCCs of an HMM model and Input WAV File represent?

While creating MFCCs following Voxforge's tutorial for a Speech to Text System using HTK (Hidden Markov Model Tool Kit), we are required to define a prototype model for our phones. I am trying to wrap my head around this file. ~o 25…
Ajay H
  • 794
  • 2
  • 11
  • 28
2
votes
0 answers

ValueError: string size must be a multiple of element size

I am classifying two classes of audio samples (.wav files). Below is the sample code where the error arises: > for curr_class in classes: > dirname = os.path.join(data_dir_train, curr_class) > for fname in os.listdir(dirname): > with…
Nikhil
  • 21
  • 1
  • 4
2
votes
0 answers

Android sound processing

I have a Java application that computes the MFCCs of an audio file by reading it into an AudioInputStream object and then writing it into a byte array. public double[] convertsignal(File signalfile) throws UnsupportedAudioFileException,…
AkshayeAP
  • 17
  • 3
2
votes
1 answer

Error while importing scikits.talkbox

I want to use scikits.talkbox, but I get the following error when importing scikits.talkbox. Traceback (most recent call last): File "/home/seref/Desktop/machine learning codes/MFCC/main.py", line 3, in <module> from scikits.talkbox.features.mfcc…
2
votes
0 answers

What to do with MFCC?

I'm currently trying to implement simple word recognition from a standard microphone with Python. I have already sampled the data and extracted the MFCC matrix from the audio signal. But the question is: what should I do with these features to get phonemes or…
Stro
  • 21
  • 1
2
votes
1 answer

Next steps to do with the mfccs, in voice recognition web based

I am working on Urdu (a language spoken in Pakistan, India, and Bangladesh) voice recognition to translate Urdu speech into Urdu words. So far I have only found the Meyda JavaScript library for extracting MFCCs from data frames. Some document…
shahid
  • 21
  • 2
2
votes
1 answer

Librosa : MFCC feature calculation

Given an audio file of 22 mins (1320 secs), Librosa extracts MFCC features with data = librosa.feature.mfcc(y=None, sr=22050, S=None, n_mfcc=20, **kwargs), and data.shape is (20, 56829). It returns a numpy array of 20 MFCC features for 56829 frames…
Rangooski
  • 825
  • 1
  • 11
  • 29
2
votes
0 answers

Python with MFCC Features to train SVM using Numpy

I'm having problems, particularly with numpy. For testing purposes, I'm trying to train on the MFCCs of two wav files. Both arrays are the same size. When I try to fit the data to the classifier, I get ValueError: Found array with dim 3.…
Ugur
  • 184
  • 1
  • 16
2
votes
0 answers

Dealing with different sized MFCC vectors as training data

I am working on a project where I am classifying coughs of a patient as either positive or negative for a certain pulmonary illness. What I have at the moment is multiple cough events, segmented from larger recordings. I have extracted various…
2
votes
1 answer

Compare two spoken words with MFCC and DTW using Aquila library

I am trying to find the similarity between spoken words using the Aquila library. My current approach is as follows. 1) First I break down the spoken word into smaller frames. 2) Then I apply MFCC to each frame and store the result in a vector. 3)…
chathux
  • 821
  • 8
  • 17
2
votes
0 answers

Python - Clustering MFCC Vectors

I am currently doing a speaker verification project using hidden Markov models. I have no accurate results on voice signals yet, though I have tested the system on various data samples (not involving voice). I extracted the MFCC of the voice signals…
Bobby
  • 31
  • 5
2
votes
1 answer

Applying neural network to MFCCs for variable-length speech segments

I'm currently trying to create and train a neural network to perform simple speech classification using MFCCs. At the moment, I'm using 26 coefficients for each sample, and a total of 5 different classes - these are five different words with varying…
SJunejo
  • 71
  • 1
  • 8
2
votes
0 answers

Speech Recognition Using MFCC to rectify pronunciation

I am building a speech recognition application for iOS in Objective-C/C++ to rectify the speaker's pronunciation. I am using Mel-Frequency Cepstral Coefficients and matching the two sound waves using DTW. Please correct me if I am wrong.…
2
votes
0 answers

Are MFCC files generated from MATLAB and Sphinx4 different?

I converted a .wav file into an .mfc file using MATLAB. I found two MATLAB codes to do the same.…