
Why does this work in Google Colab but not in Docker?

So this is my Dockerfile:

FROM python:3.7
RUN pip install -q transformers tensorflow 
RUN pip install ipython
ENTRYPOINT ["/bin/bash"]
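(For reference, building the image and getting a shell inside it looks something like this; the image tag is just a placeholder.)

docker build -t qa_image .
docker run -it qa_image    # drops into bash, then run `ipython` from there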

And I'm executing this:

from transformers import *
nlp = pipeline(
    'question-answering', 
    model='mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es',
    tokenizer=(
        'mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es',  
        {"use_fast": False}
    )
)
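(For context, once the pipeline loads, it is called with a question and a context; the Spanish strings below are just made-up examples.)

result = nlp(
    question='¿Quién escribió El Quijote?',
    context='El Quijote fue escrito por Miguel de Cervantes.'
)
print(result)  # dict with 'answer', 'score', 'start', 'end'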

But I get this error:

Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 465/465 [00:00<00:00, 325kB/s]
Downloading: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 242k/242k [00:00<00:00, 796kB/s]
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 112/112 [00:00<00:00, 70.1kB/s]
Downloading: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 135/135 [00:00<00:00, 99.6kB/s]
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    461                 if resolved_archive_file is None:
--> 462                     raise EnvironmentError
    463             except EnvironmentError:

OSError: 

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-1-1f9fed95967a> in <module>
      5     tokenizer=(
      6         'mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es',
----> 7         {"use_fast": False}
      8     )
      9 )

/usr/local/lib/python3.7/site-packages/transformers/pipelines.py in pipeline(task, model, config, tokenizer, framework, **kwargs)
   1882                 "Trying to load the model with Tensorflow."
   1883             )
-> 1884         model = model_class.from_pretrained(model, config=config, **model_kwargs)
   1885 
   1886     return task_class(model=model, tokenizer=tokenizer, modelcard=modelcard, framework=framework, task=task, **kwargs)

/usr/local/lib/python3.7/site-packages/transformers/modeling_tf_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   1207         for config_class, model_class in TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING.items():
   1208             if isinstance(config, config_class):
-> 1209                 return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
   1210         raise ValueError(
   1211             "Unrecognized configuration class {} for this kind of TFAutoModel: {}.\n"

/usr/local/lib/python3.7/site-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    467                     f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a file named one of {TF2_WEIGHTS_NAME}, {WEIGHTS_NAME}.\n\n"
    468                 )
--> 469                 raise EnvironmentError(msg)
    470             if resolved_archive_file == archive_file:
    471                 logger.info("loading weights file {}".format(archive_file))

OSError: Can't load weights for 'mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es'. Make sure that:

- 'mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es' is a correct model identifier listed on 'https://huggingface.co/models'

- or 'mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es' is the correct path to a directory containing a file named one of tf_model.h5, pytorch_model.bin.

However, this works perfectly in Google Colab. The Colab notebook doesn't require a GPU to run, so why wouldn't it work in Docker? What dependencies could I be missing? The error message doesn't point to missing dependencies, only to the model not being found, yet the model "mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es" does exist on huggingface.co.

Rainb
  • Do you have a local clone of said model in the Colab instance? Otherwise, I suspect that the download path on your docker container may differ from the standard installation location of `transformers`. See [here](https://stackoverflow.com/questions/61798573/where-does-hugging-faces-transformers-save-models) for reference of the download folder. – dennlinger Jul 12 '20 at 19:03
  • @dennlinger Hmm no, the Google Colab script doesn't seem to require a Google Drive connection or a GPU; maybe it's an environment variable? – Rainb Jul 13 '20 at 04:57
  • I don't think it has anything to do with the GPU, and even without Google Drive, my suspicion would be that Docker has the `~/.cache` directory somewhere else, where huggingface isn't looking. – dennlinger Jul 13 '20 at 10:59

1 Answer


I believe the issue is that the Docker container doesn't keep a cache (nothing persists between runs), and Hugging Face can't download and reuse the model files without a cache directory it can write to.
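One quick way to confirm where the library is actually caching downloads inside the container is to print the cache path (a minimal sketch, assuming a transformers 3.x-era install where TRANSFORMERS_CACHE is exposed in transformers.file_utils):

import os
from transformers.file_utils import TRANSFORMERS_CACHE  # directory the library caches downloads in

print("cache dir:", TRANSFORMERS_CACHE)
print("exists:", os.path.isdir(TRANSFORMERS_CACHE))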

You should create a cache folder (for example "hf_cache") in your project directory and mount it into the container. First, set the environment variable in your Dockerfile like so:

# Set the environment variable for the Hugging Face cache directory
ENV TRANSFORMERS_CACHE=/hf_cache

Then mount the cache directory like this:

docker run -v $(pwd)/hf_cache:/hf_cache hf_image

(Or, if you're using PyCharm, just use a Docker run configuration.)

This tells Docker to use your local cache directory and persists it between runs, so Hugging Face can access, cache, and reuse the downloaded models instead of re-downloading them every time.
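Putting it together, a sketch of the full Dockerfile from the question with the cache setting added (the /hf_cache path and hf_image tag are just the placeholder names used above):

FROM python:3.7
RUN pip install -q transformers tensorflow
RUN pip install ipython

# Point Hugging Face at the directory that will be mounted from the host
ENV TRANSFORMERS_CACHE=/hf_cache

ENTRYPOINT ["/bin/bash"]

Then build and run:

docker build -t hf_image .
docker run -it -v $(pwd)/hf_cache:/hf_cache hf_image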

I've run this locally on Docker and it works for me.

It's also good practice to include a requirements.txt file and have Docker install its contents; after resolving this issue, I still had to install a couple of extra dependencies.
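For example (a sketch; the exact packages, and whether you pin versions, depend on your script):

# requirements.txt
transformers
tensorflow
ipython

and in the Dockerfile:

COPY requirements.txt .
RUN pip install -r requirements.txt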

Jess