Questions tagged [mfcc]

Mel-Frequency Cepstral Coefficients (MFCCs) are an alternative representation of a speech signal based on its frequency content. They are a very popular way to represent a speech signal as a feature vector and are used primarily for speech recognition tasks.

Mel-Frequency Cepstral Coefficients (MFCCs) are the coefficients obtained when a speech signal is analysed by a bank of filters whose center frequencies are equally spaced on the mel scale, which makes them roughly logarithmically spaced in Hz at higher frequencies. This choice of center frequencies is significant because it approximates the frequency resolution of the human ear. MFCCs are computed from the magnitude mel-spectrogram by log-scaling it and then applying the Discrete Cosine Transform to obtain the cepstrum. MFCCs are very popular for speech recognition tasks.
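The pipeline described above (magnitude spectrum, mel filter bank, log-scaling, DCT) can be sketched for a single frame using only the Python standard library. This is a minimal illustration, not a reference implementation: the helper names, the Hamming window, the 2595 * log10(1 + f / 700) mel formula, the naive DFT, and the defaults of 20 filters and 13 coefficients are all assumptions chosen for the example, and real libraries differ in scaling and normalization.

```python
import cmath
import math


def hz_to_mel(hz):
    # One common mel formula (assumed here); other variants exist.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but fine for one short frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    # Fractional FFT-bin position of each filter edge and center.
    bins = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            rising = (k - left) / (center - left)
            falling = (right - k) / (right - center)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def dct_ii(values, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i, v in enumerate(values))
        for k in range(n_out)
    ]


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    n = len(frame)
    # Hamming window to reduce spectral leakage.
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
                for t, x in enumerate(frame)]
    spectrum = magnitude_spectrum(windowed)
    bank = mel_filterbank(n_filters, n, sample_rate)
    mel_energies = [sum(w * s for w, s in zip(row, spectrum)) for row in bank]
    # Floor the energies so the logarithm is always defined.
    log_energies = [math.log(max(e, 1e-10)) for e in mel_energies]
    return dct_ii(log_energies, n_coeffs)


# Usage: one 256-sample frame of a 440 Hz tone sampled at 8000 Hz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440.0 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In practice the frame-by-frame loop, windowing and FFT would come from a signal-processing library; the sketch only makes the order of the steps explicit.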

312 questions

1 answer

MFCC with Java Linear and Logarithmic Filters

I am implementing the MFCC algorithm in Java. There is sample code for triangular filters and MFCC in Java. Here is the link: MFCC Java. However, I should follow this code written in Matlab: MFCC Matlab. My question is that in the Matlab code it talks…

asked Jun 02 '11 at 09:52

kamaci

0 answers

How to fix broken data in feature extraction/pre-processing in speech recognition?

I am very new to machine learning. I stumbled on this source code on GitHub that has no database, so I decided to use my own database. The code recognizes speakers with MFCC and GMM-UBM. But when I try to run the code, I get this error…

python speech-recognition mfcc gmm

asked May 20 '20 at 03:10

lorita

1 answer

How to save arrays in a .npz structure compatible with FBK Fairseq for Direct Speech Translation?

I generated an npz file with numpy using the code np.savez(outpath + "/data.npz", **keywords), where keywords is a dictionary structured as: "0" : array "1" : array. Each array is a 2D array containing MFCC features extracted with speechpy. For…

python numpy multidimensional-array pytorch mfcc

asked May 15 '20 at 09:35

gdc

1 answer

Preparing MFCC audio feature- Should all WAV files be at same length?

I would like to prepare an audio dataset for a machine learning model. Each .wav file should be represented as an MFCC image. While all of the images will have the same number of MFCCs (= 20), the lengths of the .wav files are between 3-5…

machine-learning feature-extraction librosa mfcc

asked Mar 01 '20 at 10:19

21kc

0 answers

How to correctly unpickle a file (ModuleNotFoundError)?

I saved a model using Pickle using this code below: picklefile = path.split("-")[0]+".gmm" Pickle.dump(gmm,open(dest + picklefile,'w')) print '+ modeling completed for person:',picklefile," with data point = ",list_features.shape list_features =…

python pickle mfcc gmm

asked Dec 17 '19 at 12:19

Kinjal Rathod

1 answer

MFCC feature extraction, Librosa

I want to extract MFCC features from an audio file sampled at 8000 Hz with a frame size of 20 ms and an overlap of 10 ms. What parameters should I pass to the librosa.feature.mfcc() function? Does the code written below specify 20ms chunks with 10ms…

feature-extraction mfcc librosa

asked Jul 06 '19 at 06:25

Prithvi Allurkar
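For the framing question above, the conversion from milliseconds to samples is plain arithmetic and can be checked without any audio library. The sketch below uses hypothetical helper names and assumes that "overlap" means the portion shared by consecutive frames, so the hop is the frame length minus the overlap. In Librosa these two values would typically be supplied as the n_fft (or win_length) and hop_length arguments, but that mapping should be confirmed against the Librosa documentation for the installed version.

```python
def ms_to_samples(duration_ms, sample_rate):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(duration_ms * sample_rate / 1000.0))


def count_frames(n_samples, frame_length, hop_length):
    """Number of full frames when the signal is NOT padded at the edges."""
    if n_samples < frame_length:
        return 0
    return 1 + (n_samples - frame_length) // hop_length


sample_rate = 8000                               # Hz, from the question
frame_length = ms_to_samples(20, sample_rate)    # 20 ms -> 160 samples
overlap = ms_to_samples(10, sample_rate)         # 10 ms -> 80 samples
hop_length = frame_length - overlap              # 80 samples between frame starts

print(frame_length, hop_length)
print(count_frames(sample_rate, frame_length, hop_length))  # frames in 1 second
```

Libraries that pad the signal before framing will report a slightly different frame count than count_frames does here.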

0 answers

What method does Librosa use to calculate Delta-MFCC?

I am trying to generate the delta-MFCCs. Apparently there are several implementations. I found the "regression" formula link here. But I don't understand why Librosa uses a Savitzky-Golay filter, which is a smoothing filter. I have not found any…

audio-processing mfcc librosa

asked Jun 14 '19 at 10:41

Satashree Roy
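As background to the delta question above, the "regression" formula is commonly written as d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2). The sketch below assumes that this is the formula the question refers to; the function name, the default N = 2, and the choice to repeat the first and last frames at the edges are illustrative assumptions, and it makes no claim about how Librosa computes deltas internally.

```python
def delta(frames, width=2):
    """Regression-style deltas for a list of frames (each a list of coefficients).

    Edges are handled by repeating the first and last frame (an assumption).
    """
    n_frames = len(frames)
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for c in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][c]
                behind = frames[max(t - n, 0)][c]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


# Usage: coefficients that grow linearly over time have a constant slope,
# so away from the edges the deltas equal that slope.
ramp = [[float(t), 2.0 * t] for t in range(8)]
deltas = delta(ramp)
print(deltas[3])
```

Applying the same function to its own output gives delta-delta features under the same assumptions.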

1 answer

How to make 3 dimensional array for CNN input python

I am trying to train a CNN to recognize emotion in speech. For this I am using the mel-cepstral coefficients (MFCC), which represent each audio file as a two-dimensional array (number of frames * number of MFCC coefficients). I want to have a…

python arrays multidimensional-array conv-neural-network mfcc

asked May 25 '19 at 10:12

ness_cons

1 answer

normalizing mel spectrogram to unit peak amplitude?

I am new to both python and librosa. I am trying to follow this method for a speech recognizer: acoustic front end My code: import librosa import librosa.display import numpy as np y, sr = librosa.load('test.wav', sr = None) normalizedy =…

python signal-processing spectrogram mfcc librosa

asked Jan 30 '19 at 02:19

sabri

0 answers

GMM and MFCC for language identification

I am new to the machine learning domain. Currently, I am trying to implement an audio language detection system, based on MFCC, delta, delta-delta and Mel Spectrum Coefficients of any audio file. These features are extracted using librosa. Librosa…

machine-learning speech-recognition mfcc gmm

asked Nov 10 '18 at 11:31

Amit K.S

2 answers

Transition between Audiosegment object and a wave file/data

I am extracting MFCC features from mp3 voice files, but I want to keep the source files unchanged and avoid adding any new files. My processing includes the following steps: Load .mp3 file, eliminate silence, and generate .wav data using…

python scipy scikit-learn mfcc pydub

asked Aug 13 '18 at 15:42

SuperKogito

1 answer

AttributeError: 'Series' object has no attribute 'label'

I'm trying to follow a tutorial on sound classification in neural networks, and I've found 3 different versions of the same tutorial, all of which work, but they all reach a snag at this point in the code, where I get the "AttributeError: 'Series'…

python neural-network classification mfcc

asked Aug 01 '18 at 13:07

ZeLobster

0 answers

How to use MFCC TarsosDSP with microphone in android

In this example (answer): How to get MFCC with TarsosDSP? they show how to use MFCC in Android @Test from a float array. I'm trying to use it with data from the microphone: int sampleRate = 44100; int bufferSize = 8192; int bufferOverlap =…

android audio-processing mfcc android-thread tarsosdsp

asked Apr 24 '18 at 18:37

Atheel Massalha

1 answer

generate mfcc's for audio segments based on annotated file

My main goal is to feed MFCC features to an ANN. However, I am stuck at the data pre-processing step and my question has two parts. BACKGROUND: I have an audio file. I have a txt file that has the annotation and time stamp like this: 0.0 2.5…

python audio mfcc librosa

asked Jan 19 '18 at 03:02

kRazzy R

1 answer

ValueError: could not broadcast input array from shape (20,590) into shape (20)

I am trying to extract features from .wav files by using the MFCCs of the sound files. I am getting an error when I try to convert my list of MFCCs to a numpy array. I am quite sure that this error is occurring because the list contains MFCC values…

python numpy machine-learning signal-processing mfcc

asked Dec 28 '17 at 03:52

Sreehari R
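A shape error like the one in the question above often appears when the MFCC matrices in a list have different numbers of frames, so they cannot be stacked into one rectangular array. Assuming that is the cause here (the excerpt is truncated, so this is a guess), one common workaround is to zero-pad every matrix along the time axis to the longest length before stacking. The helper names are hypothetical and plain nested lists stand in for arrays to keep the sketch dependency-free.

```python
def pad_frames(mfcc, target_frames, fill=0.0):
    """Pad (or truncate) each coefficient row to exactly target_frames values."""
    return [row[:target_frames] + [fill] * (target_frames - len(row)) for row in mfcc]


def stack_padded(mfccs, fill=0.0):
    """Pad every matrix to the longest frame count so all shapes match."""
    target = max(len(m[0]) for m in mfccs)
    return [pad_frames(m, target, fill) for m in mfccs]


# Usage: two matrices with 2 coefficients each but 3 and 5 frames.
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0] * 5, [2.0] * 5]
stacked = stack_padded([a, b])
print([(len(m), len(m[0])) for m in stacked])
```

Whether zero-padding, truncation, or a fixed-length summary (such as the mean over frames) is appropriate depends on the model the features are fed to.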
