I'm trying to load a pretrained SpeechBrain HuggingFace model from local files; I don't want it to call out to HuggingFace to download. However, unless I change the pretrained_path in hyperparams.yaml, it is still calling out to HuggingFace and downloading the models from HF.

from speechbrain.pretrained import EncoderClassifier
model_folder = "/local/path/to/folder/with/model_files"
model = EncoderClassifier.from_hparams(source=model_folder)

To get the model files into a local directory:

  1. I downloaded the model from HuggingFace.
  2. I moved the actual model files from ~/.cache/huggingface/hub/ to the model_folder path and renamed them to their symlinked names: embedding_model.ckpt, label_encoder.ckpt, and classifier.ckpt.
  3. I then tried changing pretrained_path in hyperparams.yaml to model_folder. But that causes the model to not load properly.
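
A minimal sketch of steps 1-2, assuming the cache uses the usual Hugging Face layout in which a snapshot directory holds the readable file names as symlinks to hashed blobs. The function name copy_snapshot is hypothetical (it is not part of SpeechBrain or huggingface_hub), and the snapshot directory has to be located by hand on your own machine. Copying through the symlinks keeps the original names, so no manual renaming is needed.

```python
import shutil
from pathlib import Path


def copy_snapshot(snapshot_dir, model_folder):
    """Copy every file in a cache snapshot directory into model_folder.

    Symlinks are followed, so each copy is a regular file that keeps the
    readable name (e.g. hyperparams.yaml) instead of the blob hash.
    Returns the list of copied file names.
    """
    src = Path(snapshot_dir)
    dst = Path(model_folder)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(src.iterdir()):
        if entry.is_file():  # also True for a symlink that points at a file
            shutil.copy(entry, dst / entry.name)  # follows symlinks by default
            copied.append(entry.name)
    return copied
```

Calling copy_snapshot(snapshot_dir, model_folder) with the model's snapshot directory and the target folder returns the names that were copied, which can then be compared against what hyperparams.yaml expects.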

HF model: https://huggingface.co/TalTechNLP/voxlingua107-epaca-tdnn

What's the right way to load an EncoderClassifier from local files?

Nat G
  • I think the question would benefit from having only `speechbrain` as a tag rather than speech recognition, Hugging Face, etc. These configs are specific to the SpeechBrain toolkit :) – cassota Mar 03 '23 at 15:24

2 Answers

SOLVED: user error. Steps 1-3 above do work. One of my files had the wrong name: it should be label_encoder.txt, not label_encoder.ckpt. You can confirm this by checking which file names hyperparams.yaml for voxlingua107-epaca-tdnn expects.
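
A rough sketch of that check, assuming hyperparams.yaml refers to its files as `<pretrained_path>/<name>` (the pattern used in SpeechBrain pretrainer configs such as the one in the other answer). The helper name missing_pretrained_files is hypothetical, and the regex scans the raw text rather than parsing the YAML, so it would also pick up commented-out entries.

```python
import re
from pathlib import Path

# Captures the file name in entries such as
#   label_encoder: !ref <pretrained_path>/label_encoder.txt
PATH_REF = re.compile(r"<pretrained_path>/([^\s'\"]+)")


def missing_pretrained_files(model_folder):
    """List files that hyperparams.yaml references but model_folder lacks."""
    folder = Path(model_folder)
    yaml_text = (folder / "hyperparams.yaml").read_text(encoding="utf-8")
    expected = sorted(set(PATH_REF.findall(yaml_text)))
    return [name for name in expected if not (folder / name).exists()]
```

Running missing_pretrained_files(model_folder) before calling from_hparams should return an empty list; any name it returns is a file that is misnamed or missing.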

Nat G

Just adding my 10 cents as a SpeechBrain newbie, hope it helps other people.

For me the model was another ECAPA-TDNN, but for emotion recognition on IEMOCAP instead of LID on VoxLingua107. Besides, the model was on Google Drive rather than on HF hub.

The first step was to move hyperparams.yaml and label_encoder.txt into the checkpoint dir save/CKPT+2021-07-04+12-04-23+00/, and then edit the YAML config file as follows. The real trick, though, is to point the pretrainer's paths at the *.ckpt files under pretrained_path.

# this param should be edited
pretrained_path: /tmp/ECAPA-TDNN/1968/save/CKPT+2021-07-04+12-04-23+00

# the following must be appended
label_encoder: !new:speechbrain.dataio.encoder.CategoricalEncoder

pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
    loadables:
        embedding_model: !ref <embedding_model>
        classifier: !ref <classifier>
        mean_var_norm: !ref <mean_var_norm>
        label_encoder: !ref <label_encoder>
    paths:
        embedding_model: !ref <pretrained_path>/embedding_model.ckpt
        classifier: !ref <pretrained_path>/classifier.ckpt
        mean_var_norm: !ref <pretrained_path>/normalizer.ckpt
        label_encoder: !ref <pretrained_path>/label_encoder.txt

Then I wrote a Python script that performs inference on a single file.

from speechbrain.pretrained import EncoderClassifier
model_folder = "/tmp/ECAPA-TDNN/1968/save/CKPT+2021-07-04+12-04-23+00"
model = EncoderClassifier.from_hparams(source=model_folder)
x = model.classify_file("/tmp/angry.wav")
print(x)  # (tensor([[0.4100, 0.3910, 0.2401, 0.7020]]), tensor([0.7020]), tensor([3]), ['neu'])

You also have to make sure that the interface class you import from speechbrain.pretrained (EncoderClassifier here) matches the model's type.

cassota