To support decoding 'mp3' audio files, please install 'sox'

Question

I'm trying to build an ASR model using transfer learning on a wav2vec 2 model. Whenever I want to show or modify an audio file, I get this problem:

def prepare_dataset(batch):
    audio = batch["audio"]

    # batched output is "un-batched"
    batch["input_values"] = processor(audio["array"], sampling_rate=audio["sampling_rate"]).input_values[0]
    batch["input_length"] = len(batch["input_values"])
    
    with processor.as_target_processor():
        batch["labels"] = processor(batch["sentence"]).input_ids
    return batch

common_voice_train = common_voice_train.map(prepare_dataset, remove_columns=common_voice_train.column_names)
common_voice_test = common_voice_test.map(prepare_dataset, remove_columns=common_voice_test.column_names)

The errors:

RuntimeError: Backend "sox_io" is not one of available backends: ['soundfile']. ImportError: To support decoding 'mp3' audio files, please install 'sox'.

These are my PyTorch and torchaudio versions:

import torch
import torchaudio

print(torch.__version__)
print(torchaudio.__version__)

1.13.1+cu117
0.13.1+cu117

I really need help fixing this problem, this is part of my junior project! )':

I've tried reinstalling PyTorch and installing different versions, but nothing worked. The code works fine in Colab, but it's impossible for me to train it there, so I have to use Visual Studio Code...

Are you on Windows or Linux? If Linux which one (Ubuntu, Fedora, etc)? Did you install python-sox with `pip install sox`? — Corralien, Jan 26 '23 at 10:17

score 0 · Answer 1 · answered Jan 26 '23 at 22:43

First, note that the second error message is not from torchaudio and it's not accurate. TorchAudio does not depend on an external sox package.

TorchAudio provides limited IO features on Windows, as libsox does not compile on Windows with VS2019. This situation is being worked on, but as of v0.13, Windows users need a workaround.

A simple way is to use another library such as soundfile and convert the decoded NumPy ndarray into a PyTorch Tensor.

Another way is to install FFmpeg and use torchaudio.io.StreamReader. You can write your own load function by following this tutorial:

https://pytorch.org/audio/0.13.1/tutorials/streamreader_basic_tutorial.html#sphx-glr-tutorials-streamreader-basic-tutorial-py
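A load function along those lines might look roughly like the sketch below. The function names, the 16 kHz default (a common choice for wav2vec 2 models) and the chunk size are assumptions for illustration, not taken from the tutorial, and the code needs FFmpeg to be installed where torchaudio can find it.

```python
import torch


def chunks_to_waveform(chunks):
    """Join (frames, channels) chunks into one (channels, frames) tensor."""
    return torch.cat(list(chunks), dim=0).T


def load_with_ffmpeg(path, sample_rate=16000, frames_per_chunk=4096):
    """Hypothetical loader built on torchaudio.io.StreamReader."""
    # Imported here so the helper above stays usable without FFmpeg.
    from torchaudio.io import StreamReader

    reader = StreamReader(path)
    # Ask StreamReader to resample to the requested rate while decoding.
    reader.add_basic_audio_stream(
        frames_per_chunk=frames_per_chunk, sample_rate=sample_rate
    )
    # stream() yields one tuple per step, with one entry per output stream.
    chunks = [chunk for (chunk,) in reader.stream() if chunk is not None]
    return chunks_to_waveform(chunks), sample_rate
```

Calling `load_with_ffmpeg("clip.mp3")` (the file name is a placeholder) would return a `(channels, frames)` tensor together with the sample rate, similar to what `torchaudio.load` returns.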

Alright, thank you for your answer! I'll try the solutions you gave me; hopefully they're not hard to do, because I'm a newbie at this. I'll keep you updated. — FOXASDF, Jan 28 '23 at 09:49
