Questions tagged [torchaudio]

56 questions
14
votes
4 answers

How can I invert a MelSpectrogram with torchaudio and get an audio waveform?

I have a MelSpectrogram generated from: eval_seq_specgram = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate, n_fft=256)(eval_audio_data).transpose(1, 2) So eval_seq_specgram now has a size of torch.Size([1, 128, 499]), where 499 is the…
Shamoon
3
votes
2 answers

"RunTime Error: Failed to load audio" for mp3 file (waveform, torchaudio)

No matter how I import my audio file (uploading it to Google Colab or importing it through Google Drive), I keep getting the same error. Could it be a path issue, and if so, how could I go about fixing it? When I run an "IPython.display", it…
ihavenoidea
3
votes
1 answer

How do I know which spectrogram frames belong to which audio samples?

I’ve been using this script: spgram = torchaudio.transforms.Spectrogram(512, hop_length=32) audio = spgram(audio) to get the spectrogram of some stereo music audio. I expected that the resulting spectrogram has the shape [2, 257, audio.shape[1]/32]…
halimamran
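
For the frame-to-sample question above, a minimal sketch of the arithmetic may help. It assumes the transform pads the signal so that each frame is centered on a multiple of hop_length (believed to be the torchaudio default, center=True) and that the window length equals n_fft. The helper names num_frames and frame_sample_range are hypothetical and are not part of torchaudio.

```python
def num_frames(num_samples: int, hop_length: int) -> int:
    """Number of STFT frames, assuming centered (padded) framing."""
    return 1 + num_samples // hop_length


def frame_sample_range(frame_index: int, n_fft: int, hop_length: int,
                       num_samples: int) -> tuple[int, int]:
    """Half-open range [start, end) of original samples seen by one frame.

    Assumes centered framing: frame t is centered on sample t * hop_length
    and spans n_fft samples, clipped to the bounds of the signal.
    """
    center = frame_index * hop_length
    start = max(0, center - n_fft // 2)
    end = min(num_samples, center + n_fft // 2)
    return start, end


# Values from the excerpt: n_fft=512, hop_length=32.
n_fft, hop_length = 512, 32
num_samples = 32000  # hypothetical clip length, chosen for illustration

freq_bins = n_fft // 2 + 1            # 257 frequency bins, as in the question
frames = num_frames(num_samples, hop_length)
first = frame_sample_range(0, n_fft, hop_length, num_samples)
tenth = frame_sample_range(10, n_fft, hop_length, num_samples)
```

Under these assumptions the frame count is one more than num_samples / hop_length, and neighbouring frames overlap heavily, because each frame covers 512 samples while the hop is only 32.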
2
votes
2 answers

Identifying the loudest part of an audio track and cropping (Librosa or torchaudio)

I've built a U-Net model to perform audio mixing of multitrack audio, for which I've used 20s clips of the audio tracks (converted into spectrograms) as input in training the model. However, the training process is incredibly long, so I think it…
Brudalaxe
2
votes
1 answer

torchaudio: Error opening '_sample_data\\steam.mp3': File contains data in an unknown format

I'm new to torchaudio and I'm following this tutorial step by step. I'm having a problem loading an mp3 file using torchaudio.info(path). Here is my code: metadata = torchaudio.info(SAMPLE_MP3_PATH) print(metadata) Here is the error that I'm…
crispengari
2
votes
1 answer

UserWarning: torchaudio C++ extension is not available

Can someone please help me out with this UserWarning in torchaudio? Error message: C:\Users\anaconda3\lib\site-packages\torchaudio\extension\extension.py:14: UserWarning: torchaudio C++ extension is not available. warnings.warn('torchaudio C++…
user15309583
2
votes
2 answers

Pip does not recognize torchaudio library

When I try the command: pip install torchaudio I get this error: ERROR: Could not find a version that satisfies the requirement torchaudio ERROR: No matching distribution found for torchaudio I use Windows 10.
Cerabbite
1
vote
0 answers

OSError: libtorch_cuda.so: cannot open shared object file: No such file or directory

I have been stuck with this problem for a while, and I would be very grateful if someone could help me resolve it. The system I am using is Ubuntu with CUDA 12.0. As mentioned, I have tried uninstalling and reinstalling…
Ivan Wang
1
vote
1 answer

FFmpeg installation not detected with diart

I'm using the diart library for audio transcription with the OpenAI Whisper model. When I run my code, I get this error: Traceback (most recent call last): File "/home/vkyc/Desktop/projectRasa/audio/lib/python3.10/site-…
1
vote
2 answers

Diart (torchaudio) on Windows x64 results in torchaudio error "ImportError: FFmpeg libraries are not found. Please install FFmpeg."

I am trying out a speech diarization project named diart (based on Hugging Face models). I followed the instructions, using a miniconda environment, which are essentially: conda create -n diart python=3.8 conda activate diart conda install portaudio…
LoneWanderer
1
vote
1 answer

torchaudio.save() .wav file is twice as big as the original .wav file

I'm really new to PyTorch and torchaudio. I found that the file it saves is twice as big as the original file, even though I just load a .wav file and immediately save the audio to another .wav file. Why does it get bigger? I've checked that the bit depth(?),…
KilinWei
1
vote
0 answers

torchaudio.io.StreamReader doesn't throw an error when seeking to a timestamp beyond the duration of the audio file

I am trying to get the chunk of an audio file between a specific start time and end time. Consider an audio file with a duration of 10 seconds. I need to get the chunk from 4 sec to 7 sec. torchaudio.info doesn't give the correct num_frames for io.BytesIO flac audio…
lokesh
1
vote
2 answers

Convert byte data to PyTorch tensor

I created a simple model with PyTorch to recognize bird sounds, and until now I have fed it .wav recordings. I want to start doing real-time recognition, and my question is: can I convert bytes to PyTorch tensors directly without converting it first to…
asabasdc
1
vote
2 answers

Cannot create .exe with pyinstaller from .py with torchaudio (CPU): AttributeError: '_OpNamespace' 'torchaudio' object has no attribute 'cuda_version'

I have a .py script that uses torchaudio (without GPU) to process some sound in Windows. To distribute it, I've used pyinstaller to turn it into a .exe. You can reproduce the issue with this simple script: import torchaudio import time if __name__…
ronkov
1
vote
0 answers

Broadcasting error with incompatible input/output sizes (PyTorch Wave-U-Net)

I'm trying to train a Wave-U-Net for mixing multitrack audio (8 mono stems to a stereo mixture) following the methodology of this paper, whereby: Each input consist of 121,843 samples or 2.76 seconds and the output corresponds to the center part of…
Brudalaxe