2

I'm trying to install some Python packages, namely tokenizers from huggingface transformers, which apparently needs Rust. So I am installing Rust on my Docker build:

FROM nikolaik/python-nodejs
USER pn
WORKDIR /home/pn/app

COPY . /home/pn/app/
RUN ls -la /home/pn/app/*
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y

ENV PATH ~/.cargo/bin:$PATH

# Install Python dependencies.
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
RUN python load_model.py

# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 4 --threads 4 app:app

But still, It seems like pip cannot find Rust when installing tokenizers:

Building wheels for collected packages: tokenizers
#11 103.2   Building wheel for tokenizers (pyproject.toml): started
#11 104.6   Building wheel for tokenizers (pyproject.toml): finished with status 'error'
#11 104.6   error: subprocess-exited-with-error
#11 104.6
#11 104.6   × Building wheel for tokenizers (pyproject.toml) did not run successfully.
#11 104.6   │ exit code: 1
#11 104.6   ╰─> [51 lines of output]
...
  error: can't find Rust compiler
#11 104.6
#11 104.6       If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.

Why does this happen? How can I make sure Rust is availabe?

lte__
  • 7,175
  • 25
  • 74
  • 131
  • How about you get rid of that ~ by expanding it to a full absolute path? I would also recommend you include a test in your script to see if you can run rust manually yourself. If that works, then pip will probably work as well. – hkBst Apr 17 '22 at 06:28
  • @hkBst comment is correct. Use the full path. – Kent Bull Jan 26 '23 at 21:14

1 Answers1

4

It seems the installer adds a line to your .bashrc file that sets up the path. .bashrc is only run if you're in an interactive shell which you aren't when you're running a build script. So your path isn't set up to include the directory with the Rust compiler.

As far as I can see, the compiler is installed in $HOME/.cargo/bin. In your case, that'd be /home/pn/.cargo/bin. To add that to the path, you can add an ENV line to your Dockerfile like this

FROM nikolaik/python-nodejs
USER pn
WORKDIR /home/pn/app

COPY . /home/pn/app/
RUN ls -la /home/pn/app/*
RUN curl --proto '=https' --tlsv1.2 -sSf -y https://sh.rustup.rs | sh
ENV PATH /home/pn/.cargo/bin:$PATH

# Install Python dependencies.
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
RUN python load_model.py

# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 4 --threads 4 app:app

If that doesn't work, try running a shell in the image with the command

docker run --rm -it <image name> bash

Then you can poke around and try to find the directory the rust compiler is installed in.

Hans Kilian
  • 18,948
  • 1
  • 26
  • 35
  • Apparently it's in root/.cargo, but adding that to the env still did nothing... I'm getting very frustrated with this... – lte__ Apr 16 '22 at 18:10
  • 1
    I'd think it'd be `/root/.cargo/bin` then. That's what it was for me when I tried installing it. – Hans Kilian Apr 16 '22 at 18:19
  • It's in ~/.cargo/bin both according to me bashing in and the documentation. I've updated my Dockerfile and still nothing... I'll update the question. – lte__ Apr 16 '22 at 18:40
  • even after watching the Rust installation return a "Rust successfully installed" message on the console, the tokenizers installation will just fail saying "Rust compiler not found"... I'm losing my mind... – lte__ Apr 16 '22 at 18:58
  • Honestly I think this is a pip bug/version incompatibility. With a Python 3.8 image it works. – lte__ Apr 17 '22 at 11:45
  • I hit the same problem with a `python:3.12.0a4-bullseye` image and using the `ENV PATH="${PATH}:/root/.cargo/bin` worked while using `ENV PATH="${PATH}:$HOME/.cargo/bin` did not. The $HOME variable expansion must not be happening in the Dockerfile. – Kent Bull Jan 26 '23 at 21:11