I'm trying to work on an ASR model using transfer learning on wav2vec 2 model. Anyway when I ever I wan't to show or modifiy an audio file I get this problem
def prepare_dataset(batch):
audio = batch["audio"]
# batched output is "un-batched"
batch["input_values"] = processor(audio["array"], sampling_rate=audio["sampling_rate"]).input_values[0]
batch["input_length"] = len(batch["input_values"])
with processor.as_target_processor():
batch["labels"] = processor(batch["sentence"]).input_ids
return batch
common_voice_train = common_voice_train.map(prepare_dataset, remove_columns=common_voice_train.column_names)
common_voice_test = common_voice_test.map(prepare_dataset, remove_columns=common_voice_test.column_names)
The erorrs:
RuntimeError: Backend "sox_io" is not one of available backends: ['soundfile']. ImportError: To support decoding 'mp3' audio files, please install 'sox'.
This is my pytorch and torchaudio versions:
import torch
import torchaudio
print(torch.__version__)
print(torchaudio.__version__)
1.13.1+cu117
0.13.1+cu117
I really need help fixing this problem, this is part of my junior project! )':
I've trying to installing pytorch and installing deffrent versions but nothing worked the code is working. fine in colab but it's impossible for me to train it there so I have to use visual code...