Questions tagged [librosa]

librosa is a Python package for music and audio analysis.

Some of librosa's features:

  • Load audio input
  • Compute mel-spectrogram, MFCC, delta features, chroma
  • Invert mel-spectrogram, MFCC or chroma back to waveform
  • Locate beat events
  • Compute beat-synchronous features
  • Display features
  • Save beat tracker output to a CSV file

For detailed information and examples, visit the librosa documentation.

See also the official GitHub page.

750 questions
5
votes
4 answers

Librosa mel filter bank decreasing triangles

I'm a bit stuck understanding MFCCs. From what I have read the mel filter banks should be a series of triangles that get wider and their peaks are at the same place. Like this... However when I compute the mel filter banks using librosa I…
Jack Deadman
  • 166
  • 2
  • 8
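A note on the mel filter bank question above: a likely cause of the shrinking peaks is area normalization. librosa's `librosa.filters.mel` defaults to `norm="slaney"`, which scales each triangle by 2 / (band width in Hz), so wider triangles get lower peaks; passing `norm=None` should give equal-height triangles. The sketch below is not librosa's implementation. It uses the HTK mel formula for brevity (an assumption; librosa's default mel scale differs), and the function names and parameters are illustrative only.

```python
import math


def hz_to_mel(f):
    """HTK-style Hz -> mel conversion (assumed here for simplicity)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def triangle_widths_and_peaks(n_mels=8, fmin=0.0, fmax=8000.0):
    """Return (widths in Hz, area-normalized peak heights) for n_mels triangles."""
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale.
    edges = [mel_to_hz(lo + (hi - lo) * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Triangle i spans edges[i] .. edges[i + 2].
    widths = [edges[i + 2] - edges[i] for i in range(n_mels)]
    # A triangle with base w and height 2 / w has unit area.
    peaks = [2.0 / w for w in widths]
    return widths, peaks


widths, peaks = triangle_widths_and_peaks()
for w, p in zip(widths, peaks):
    print(f"width = {w:8.1f} Hz   peak = {p:.6f}")
```

The widths grow with frequency while the peaks fall, which is the "decreasing triangles" pattern; each filter still has the same area.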
4
votes
0 answers

Changing audio volume in Jupyter notebook

I'm working with audio files in a Jupyter notebook, loading/processing with Librosa and playing audio back with IPython.display. I can use np.multiply(audio_file, 0.25) to change the amplitude of the audio array, but IPython.display's Audio plays…
mojones101
  • 161
  • 1
  • 2
  • 12
4
votes
1 answer

How to log scale a 2D Matrix / Image

I have a 2D NumPy array of an audio spectrogram and I want to save it as an image. I'm using the librosa library to get the spectrum, and I can also plot it using the librosa.display.specshow() function. There are a number of different scaling types as you can…
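For the log-scaling question above, one common approach is to convert magnitudes to decibels and then rescale to 0..255 before saving. The sketch below uses plain Python lists so it runs without dependencies; the function names, `amin`, and `top_db` defaults are assumptions for illustration. librosa's own `librosa.amplitude_to_db` does a similar conversion on NumPy arrays.

```python
import math


def amplitude_to_db(matrix, ref=1.0, amin=1e-5, top_db=80.0):
    """Convert a 2D list of magnitudes to dB, clipping to top_db below the peak."""
    db = [[20.0 * math.log10(max(amin, abs(v)) / ref) for v in row] for row in matrix]
    if top_db is not None:
        floor = max(max(row) for row in db) - top_db
        db = [[max(v, floor) for v in row] for row in db]
    return db


def to_uint8(db):
    """Linearly rescale a 2D list to integer pixel values in 0..255."""
    lo = min(min(row) for row in db)
    hi = max(max(row) for row in db)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a flat image
    return [[round(255 * (v - lo) / span) for v in row] for row in db]


spectrogram = [[1.0, 0.1], [0.01, 0.0]]
pixels = to_uint8(amplitude_to_db(spectrogram))
print(pixels)
```

The resulting integer grid can be handed to whichever image writer is in use.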
4
votes
2 answers

Install librosa on Raspberry Pi 4: error with the llvmlite wheel

I'm working on a Raspberry Pi 4 with Python 3 and I want to install librosa (pip3 install librosa). I previously installed llvm version 7.0.1. Following the compatibility table, I installed llvmlite https://pypi.org/project/llvmlite/ $…
Simon
  • 43
  • 1
  • 3
4
votes
1 answer

Can't use librosa with Python 3

I have installed librosa correctly with pip3 on the Ubuntu subsystem on Windows, but when I try to execute a simple program like this one: import librosa data, sr = librosa.load('sound.mp3') print(data.shape) It is what…
4
votes
0 answers

Librosa load many MP3 memory usage

I want to load about 25K MP3 audio files in a loop and process them in a Jupyter Notebook. When loading these audio files my RAM usage keeps growing when this should not be the case. When examining the variables in RAM the audio files do not show…
Mark wijkhuizen
  • 373
  • 3
  • 10
4
votes
1 answer

I am getting "OSError: sndfile library not found" & "Unable to locate package libsndfile1" errors when deploying audio prediction model on Heroku

The objective is to deploy an audio prediction ML model on Heroku, which uses the librosa library for Python. The app.py file uses librosa to extract features from the audio. When I try to deploy on Heroku, I get an error as shown…
4
votes
1 answer

How can we improve tempo detection accuracy in Librosa?

I'm using the native beat_track function from Librosa: from librosa.beat import beat_track tempo, beat_frames = beat_track(audio, sampling_rate) The original tempo of the song is at 146 BPM whereas the function approximates 73.5 BPM. While I…
Akash Sonthalia
  • 362
  • 2
  • 12
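A note on the tempo question above: 73.5 BPM against a true 146 BPM looks like a classic "octave error", where the tracker locks onto half the real tempo. One simple post-processing heuristic is to double or halve the estimate until it falls inside a range you expect for your material. The helper below is hypothetical, not part of librosa, and the 90..180 BPM range is an assumption. `librosa.beat.beat_track` also accepts a `start_bpm` prior, which may be worth trying before post-processing.

```python
def fold_tempo(bpm, low=90.0, high=180.0):
    """Double or halve bpm until it lies in [low, high).

    Requires high >= 2 * low so that the loops always terminate.
    """
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    if high < 2 * low:
        raise ValueError("high must be at least twice low")
    while bpm < low:
        bpm *= 2.0
    while bpm >= high:
        bpm /= 2.0
    return bpm


print(fold_tempo(73.5))  # half-tempo estimate folded into the expected range
```

This only corrects factor-of-two errors; it cannot tell which octave is musically right without the assumed range.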
4
votes
3 answers

'Audio data must be audio data' error with Google speech recognition in Python

I am trying to load an audio file in Python and process it with Google speech recognition. The problem is that, unlike in C++, Python doesn't show data types, classes, or give you access to memory to convert between one data type and another by…
Mich
  • 3,188
  • 4
  • 37
  • 85
4
votes
2 answers

How can I reverse a scipy.signal.spectrogram to audio with Python?

I have: import librosa from scipy import signal import scipy.io.wavfile as sf samples, sample_rate = sf.read(args.file) nperseg = int(sample_rate * 0.001 * 20) frequencies, times, spectrogram = signal.spectrogram(samples, …
Shamoon
  • 41,293
  • 91
  • 306
  • 570
4
votes
6 answers

librosa.util.exceptions.ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(172972, 2)

Could somebody please help me solve this? I was following this tutorial: https://data-flair.training/blogs/python-mini-project-speech-emotion-recognition/ and used their dataset, which they took from the RAVDESS Dataset and lowered the sample rate of…
El_Dorado
  • 193
  • 2
  • 12
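A note on the "Invalid shape for monophonic audio" question above: a shape of (172972, 2) suggests stereo audio stored as samples by channels, passed to a function that expects a 1-D mono signal. One possible fix is to average the channels first. The sketch below uses plain lists and a hypothetical helper name; with a NumPy array in that layout, `audio.mean(axis=1)` would be the assumed equivalent.

```python
def downmix_to_mono(frames):
    """Average the channels of each frame.

    frames: sequence of per-sample tuples, e.g. [(left, right), ...]
    returns: flat list with one value per sample
    """
    return [sum(frame) / len(frame) for frame in frames]


stereo = [(1.0, 0.0), (0.5, -0.5), (0.25, 0.25)]
mono = downmix_to_mono(stereo)
print(len(stereo), "stereo frames ->", len(mono), "mono samples")
```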
4
votes
1 answer

Measuring audio "loudness": RMS vs. LUFS

I am trying to measure the "loudness" of various clips (ranging from ~2-40 seconds) of TV content. I'm interested in the relative loudness of the content - what scenes have people shouting vs whispering, loud music vs. quiet scenes, etc. I think…
ginobimura
  • 115
  • 1
  • 5
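For the loudness question above, RMS level is straightforward to compute per clip, as sketched below for samples in the range -1..1. The function names are illustrative assumptions. Note that this is not LUFS: LUFS (ITU-R BS.1770) adds a perceptual frequency weighting and gating on top of a mean-square measurement, neither of which is implemented here.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_dbfs(samples, floor=1e-10):
    """RMS level in dB relative to full scale (1.0); floor avoids log(0)."""
    return 20.0 * math.log10(max(rms(samples), floor))


clip = [0.5, -0.5, 0.5, -0.5]
print(f"RMS = {rms(clip):.3f}, level = {rms_dbfs(clip):.2f} dBFS")
```

Comparing these values across clips gives a rough relative ranking of loud and quiet scenes.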
4
votes
1 answer

How to train CNN on common voice dataset

I am trying to train a CNN with the Common Voice dataset. I am new to speech recognition and am not able to find any links on how to use the dataset with Keras. I followed this article to build a simple word classification network. But I want to…
Sashaank
  • 880
  • 2
  • 20
  • 54
4
votes
2 answers

How can a chromagram file produced by librosa be interpreted as a set of musical keys?

I have a chroma features file here. How can these numbers be interpreted as belonging to different musical keys? I need to use the key found at a particular time code to produce a solution similar to this in order to mix between two tracks. How can…
plgent
  • 109
  • 2
  • 5
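For the chromagram question above, one widely used heuristic is to average the chroma vectors over a time window and correlate the result with rotated major and minor key profiles (the Krumhansl-Schmuckler approach). The sketch below is a minimal version under stated assumptions: the profile values are the commonly quoted Krumhansl-Kessler numbers, the input is a 12-element mean chroma vector ordered from C (the usual order of librosa's chroma rows), and the function names are hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Commonly quoted Krumhansl-Kessler key profiles (tonic first).
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def estimate_key(chroma_mean):
    """Return the best-matching key name for a 12-bin mean chroma vector."""
    best_score, best_name = None, None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate the profile so its tonic weight sits at index `tonic`.
            rotated = [profile[(i - tonic) % 12] for i in range(12)]
            score = pearson(chroma_mean, rotated)
            if best_score is None or score > best_score:
                best_score, best_name = score, f"{PITCH_CLASSES[tonic]} {mode}"
    return best_name


# Toy input: a chroma vector shaped exactly like the major profile.
print(estimate_key(MAJOR))
```

Running this over successive windows gives a key estimate per time range; real recordings will match far less cleanly than the toy input.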
4
votes
1 answer

How to convert a female to a male voice using librosa?

How can I convert a man's voice to a woman's voice using librosa? I tried to convert a male voice into a female voice. I first read the WAV file with librosa and then processed the audio time series with an STFT. I hope that I can adjust the spectrum (increasing…
Shawn Plus
  • 457
  • 1
  • 6
  • 15