Questions tagged [librosa]

librosa is a Python package for music and audio analysis.

Some of librosa's features:

  • Load audio input
  • Compute mel-spectrogram, MFCC, delta features, chroma
  • Invert a mel-spectrogram or MFCC back to an approximate waveform
  • Locate beat events
  • Compute beat-synchronous features
  • Display features
  • Save beat tracker output to a CSV file

For detailed information and examples, visit the librosa documentation.

See also the official GitHub page.

750 questions
2
votes
2 answers

Relation between hop_length, win_length, frame_length, n_fft, no. of frames

I am working with MFCC features in Python via librosa: mfccs = librosa.feature.mfcc(y=y,sr=sr,n_mfcc=12,n_fft=320,hop_length=320,htk=True) Here, I took an audio signal of 1 s duration, which gave me len(y) = 16000, hence I took sr = 16000. I calculated…
Pranaswi Reddy
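
On the question above, a rough sketch of how the number of frames follows from the signal length, `hop_length`, and `n_fft`. It assumes librosa's default centered framing (`center=True`, which pads `n_fft // 2` samples on each side of the signal). `n_frames` is a hypothetical helper written for illustration, not a librosa function.

```python
def n_frames(n_samples, hop_length, n_fft, center=True):
    """Count STFT frames for a signal of n_samples samples (illustrative helper)."""
    if center:
        # Assumed librosa default: pad n_fft // 2 samples on each side.
        n_samples = n_samples + 2 * (n_fft // 2)
    # One frame for the first window, plus one per full hop that still fits.
    return 1 + (n_samples - n_fft) // hop_length


# Values from the question: 1 s of audio at 16 kHz, n_fft = hop_length = 320.
print(n_frames(16000, hop_length=320, n_fft=320))                # 51
print(n_frames(16000, hop_length=320, n_fft=320, center=False))  # 50
```

Under these assumptions the MFCC matrix in the question would have shape (12, 51): one row per coefficient and one column per frame.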
2
votes
1 answer

Librosa's inverse mel spectrogram to stft taking a long time

I am currently trying to convert a mel spectrogram back into an audio file. However, librosa's mel_to_stft function is taking a long time (upwards of 15 minutes) to process a 30-second .wav file sampled at 384 kHz. The following is my code: # Code…
Sam
2
votes
1 answer

Obtaining the Log Mel-spectrogram in Python

Other questions, such as "How to convert a mel spectrogram to log-scaled mel spectrogram", have asked how to get the log-scaled mel spectrogram in Python. My code below produces said spectrogram: ps = librosa.feature.melspectrogram(y=y, sr=sr) ps_db=…
user10467738
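
For readers unsure what "log-scaled" means here, a minimal plain-Python sketch of the power-to-decibel conversion, modelled on the behaviour documented for `librosa.power_to_db`. The helper name `power_to_db_sketch` is hypothetical, and the defaults (`ref=1.0`, `amin=1e-10`, `top_db=80.0`) are assumptions chosen to mirror that documentation; this is not the library's code.

```python
import math


def power_to_db_sketch(values, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert power values to decibels (illustrative re-implementation)."""
    # 10 * log10(S / ref), with both terms floored at amin to avoid log(0).
    ref_db = 10.0 * math.log10(max(amin, ref))
    db = [10.0 * math.log10(max(amin, v)) - ref_db for v in values]
    if top_db is not None:
        # Limit the dynamic range to top_db decibels below the peak.
        floor = max(db) - top_db
        db = [max(d, floor) for d in db]
    return db


# A peak, a value one tenth of the peak, and a near-silent value.
print(power_to_db_sketch([1.0, 0.1, 1e-12]))
```

The near-silent value is first floored at `amin` and then clipped to 80 dB below the peak, which is why very quiet bins all end up at the same level.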
2
votes
1 answer

How to remove the Y-axis labels, ticks and axis label in a plot created using librosa.display.specshow

I am using this code to visualize the mel spectrogram and save the image: spec = librosa.feature.melspectrogram(y=y,sr=sr,n_mels=128 ) plt.figure(figsize=(12, 6)) spec = librosa.amplitude_to_db(spec, ref=np.max) librosa.display.specshow(spec, sr=sr,…
2
votes
1 answer

How to resolve an invalid shape for monophonic audio

I've loaded a model for testing in Jupyter Notebook and created a path for the WAV audio file I plan on testing: anger = "C:\\Desktop\\Emotion Speech Recognition\\D_10\\10ANG_XX.wav" However, after I've extracted the features from another Python…
2
votes
1 answer

What is the second number in the MFCCs array?

When I extract MFCCs from an audio file, the output shape is (13, 22). What does the second number represent? Is it time frames? I use librosa. The code I use is: mfccs = librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=13,…
ioan_bl
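
As a hedged illustration for the question above: `librosa.feature.mfcc` returns an array of shape (n_mfcc, number of frames), so the second number counts time frames. The sketch below maps frame indices to times using frame index × hop_length / sr. The values `sr=22050` and `hop_length=512` are assumptions (librosa's defaults), since the question's actual settings are cut off, and `frame_times` is a hypothetical helper.

```python
def frame_times(n_frames, sr=22050, hop_length=512):
    """Time in seconds associated with each frame index (illustrative helper)."""
    return [i * hop_length / sr for i in range(n_frames)]


shape = (13, 22)             # (n_mfcc, n_frames), as reported in the question
n_mfcc, n_frames = shape
times = frame_times(n_frames)

print(n_mfcc, n_frames)      # 13 22
print(round(times[-1], 3))   # 0.488
```

With these assumed settings, 22 frames would correspond to roughly half a second of audio.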
2
votes
1 answer

WinError 5: Access is denied when installing numba and librosa with pip for Python

On a Python project on Windows 10 (Python 3.8.1), I want to install the librosa library with pip (for a personal project and practice in sound analysis/visualization). The first problem, after a test run in the IDLE/Shell window (of Python)…
ComradeH
2
votes
1 answer

Difficulty performing silence removal with librosa.effects.trim command

I am trying to do a project, and in part of the project I have the user say a word which gets recorded. This word then gets the silence around it cut out, and there is a button that plays back their word without the silence. I am using librosa's…
EMC
2
votes
0 answers

Librosa throws ValueError

x_val, s_rate = librosa.load(file_name, sr=sampling_rate) File "/python3.6/site-packages/librosa/core/audio.py", line 140, in load y = sf_desc.read(frames=frame_duration, dtype=dtype, always_2d=False).T File…
SAN
2
votes
1 answer

Preparing MFCC audio features: should all WAV files be the same length?

I would like to prepare an audio dataset for a machine learning model. Each .wav file should be represented as an MFCC image. While all of the images will have the same number of MFCCs (= 20), the lengths of the .wav files are between 3-5…
21kc
2
votes
1 answer

Python: time stretch wave files - comparison between three methods

I'm doing some data augmentation on a speech dataset, and I want to stretch/squeeze each audio file in the time domain. I found the following three ways to do that, but I'm not sure which one is the best or most optimized: dimension =…
2
votes
0 answers

Error opening '*.wav': File contains data in an unknown format

I'm trying to run the following code: import os import librosa import IPython.display as ipd import matplotlib.pyplot as plt import numpy as np from scipy.io import wavfile import warnings warnings.filterwarnings("ignore") train_audio_path =…
sdo
2
votes
3 answers

"No Backend Error" while reading files in Python

I am trying to perform an STFT on a bunch of sound files and I get this error. The path to the files is correct, but I still get this error. import librosa import io import numpy as np import tensorflow as tf import…
agastya teja
2
votes
2 answers

Why does librosa plot differ from matplotlib and Audacity

I am reading PCM data from a file and then plotting it. I've noticed that the plot varies between librosa.display.waveplot, plot, and Audacity. Here are the code and images: %matplotlib inline import matplotlib.pyplot as plt import…
netskink
2
votes
1 answer

How to skip to a specific frame in a given spectrogram file

I'm encountering problems skipping ahead to a specific frame of a melspec feature set found here. The aim of getting features from the feature set is to analyse the difference in beats per second (BPS) so that I can match up the BPS of two tracks in…
plgent