Questions tagged [mfcc]

Mel-Frequency Cepstral Coefficients: an alternative representation of a speech signal based on its frequency content. MFCCs are a very popular way to represent a speech signal as a feature vector and are used primarily for speech recognition tasks.

Mel-Frequency Cepstral Coefficients (MFCCs) are coefficients obtained when a speech signal is analysed by a bank of filters whose center frequencies are equally spaced on the mel scale, which is roughly linear in Hz at low frequencies and logarithmic at higher ones. This choice of center frequencies is significant because it approximates the frequency resolution of the human ear. MFCCs are computed from the magnitude mel-spectrogram by log-scaling it and then applying the Discrete Cosine Transform to obtain the cepstrum. MFCCs are very popular for speech recognition tasks.
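The pipeline described above (mel-spaced filterbank, log-scaling, DCT) can be sketched for a single frame as follows. This is a minimal illustration rather than a reference implementation: the Hz-to-mel formula is one common convention among several, the function names are made up for this example, and the filter count, FFT size, sample rate and number of coefficients are arbitrary assumptions. Real libraries differ in normalization and liftering.

```python
import math

def hz_to_mel(f_hz):
    # One common Hz-to-mel convention (assumed here; others exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies: each filter spans three consecutive edges.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct2(x):
    """Unnormalized DCT-II of a list of floats."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n))
            for k in range(n)]

def mfcc_from_spectrum(spectrum, bank, n_coeffs=13, eps=1e-10):
    """Filterbank energies -> log -> DCT, keeping the first n_coeffs values."""
    energies = [sum(w * s for w, s in zip(row, spectrum)) for row in bank]
    log_energies = [math.log(e + eps) for e in energies]  # eps avoids log(0)
    return dct2(log_energies)[:n_coeffs]

# Usage with illustrative values: 26 filters, 512-point FFT, 16 kHz audio.
bank = mel_filterbank(n_filters=26, n_fft=512, sample_rate=16000)
flat_spectrum = [1.0] * (512 // 2 + 1)   # placeholder for one frame's spectrum
coeffs = mfcc_from_spectrum(flat_spectrum, bank)
```

In practice `flat_spectrum` would be replaced by the magnitude (or power) spectrum of one windowed frame, and the function would be applied frame by frame to build the coefficient matrix.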

312 questions
0
votes
2 answers

Mel-frequency function: error with matrix dimensions

I'm trying to make a prototype audio recognition system by following this link: http://www.ifp.illinois.edu/~minhdo/teaching/speaker_recognition/. It is quite straightforward so there is almost nothing to worry about. But my problem is with the…
Dang Manh Truong
0
votes
1 answer

Using MFCC coefficients for simple voice activity detection

Since MFCC coefficients store information about amplitudes for bands of frequencies (which depend on the filter bank used), how can those coefficients be used for voice activity detection? Would it be sufficient to use these coefficients to perform…
user3038744
0
votes
1 answer

Multiple real inputs and multiple real outputs in a neural network

How can I train a perceptron where there are multiple input and output nodes and both are real-valued? I'm doing this because I want to train a neural network to predict the MFCCs given some data points (from the signal). Here is some example data:…
fxhh
0
votes
0 answers

Exception occurred: vector subscript out of range

I am trying to calculate the MFCC and DTW of a wave using the Aquila DSP library. But when I execute the following code void main() { int frame_size =1024; Aquila::WaveFile waveIn0("a_converted.wav"); Aquila::FramesCollection…
0
votes
1 answer

How to use OpenSmile in C# for MFCC extraction

I want to use the OpenSmile library in C# to extract MFCC features from WAV files, but I don't know how to use 'OpenSmile_Release.dll'. Can anyone help me?
Mahsa Parsa
0
votes
2 answers

Can you still extract features from a digital signal without converting it to analog using MFCC?

I am developing back-end speech recognition software into which the user can import mp3 files. How can I extract the features from such a digital audio file? Should I convert it back to analog first?
Allen Pol
0
votes
1 answer

What features are extracted, or what parameters are used, to distinguish a user in an ASR system using MFCC?

What features does MFCC extract from speakers during the testing phase? I know the steps for computing MFCCs: split the signal into small frames of 10 to 30 ms; apply a windowing function (Hamming is recommended for sound…
0
votes
1 answer

Simple word detector using MFCC

I am implementing software for speech recognition using Mel Frequency Cepstrum Coefficients. In particular, the system must recognize a single specified word. From the audio file I get the MFCCs in a matrix with 12 rows (the MFCCs) and as many…
0
votes
1 answer

How to extract MFCC features in PocketSphinx on Android

I recently downloaded the PocketSphinx Android Demo for Android Studio. It worked on my Galaxy S5 and I'm actually surprised by the accuracy. However, I'm struggling to extract MFCC features for several reasons: There is an explanation of how to…
Quantum
0
votes
1 answer

How to optimize N filterbank vectors?

I have 40 triangular mel-spaced filterbank vectors with 257 elements each. I want to multiply them with my power spectrum (generated using the FFT of a 20 ms audio frame) and then sum the results so I can get the mel-spaced power spectrum. The problem…
concept3d
0
votes
0 answers

Ads detection using MFCC and DTW

I am doing a project to detect ads in a transmission by using a clipped segment (slogan) of the advertisement, based on looking at the audio track as follows: Audio Signal --> Framing --> Windowing --> FFT (Fast Fourier Transform) --> DCT (Discrete…
0
votes
0 answers

Which sensor to use for surrounding detection?

I want to detect the surface on which my Android phone is resting by vibrating the phone on the surface for a few seconds (say 3) and using some machine learning features. Which sensor would be best for recording the data, and why? I tried using the microphone to…
user3274263
0
votes
1 answer

SpeechRecognition API: How to get the voice features (MEL Coefficients)

I am planning to implement a speaker verification app for Android and was wondering whether there is a way to get the voice features (MEL Coefficients) from Android's Speech Recognition module. Please note that speaker verification is slightly…
Eb Abadi
0
votes
1 answer

How to extract features from MFCC coefficients

I have successfully extracted MFCC coefficients and got the values below: -15.2366 6.4996 -2.1807 0.2495 -1.3403 0.9815 -0.1106 1.7914 0.7311 1.1881 1.3340 2.6080 1.4208 2.0144 0.5085 …
0
votes
0 answers

MATLAB Neural Network Generalisation

I'm currently working on a neural network for speech recognition in MATLAB, and have extracted MFCCs for classification purposes. Currently there are 500 features for each 1-second speech clip, and there are five different classes (i.e. five…
SJunejo