I was running a cuFFT routine in Docker and ran into a problem. I use the following Dockerfile.

FROM nvidia/cuda:9.1-runtime-ubuntu16.04
ENV NVIDIA_VISIBLE_DEVICES all
ENV LD_LIBRARY_PATH /usr/local/cuda-9.1/lib64/

FROM python:3.7
COPY --from=0 /usr/local/cuda-9.1 /usr/local/cuda-9.1
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="/opt/venv:$PATH"
RUN pip install numpy
RUN apt update && \
    apt-get -y install gcc && \
    apt-get -y install apt-utils && \
    apt-get -y install g++ && \
    apt-get -y install pciutils && \
    apt-get -y install libc6

ADD helmsolver /helmsolver
CMD ls /usr && ls /usr/local
CMD dpkg -l | grep -i cuda
CMD cd helmsolver && bash tests.sh

To build and run the image, I use these commands.

docker build -t helm .
docker run --gpus all helm

My code runs on the host, but inside the container error 35 (cudaErrorInsufficientDriver) is returned by calls such as cudaMalloc((void**)&d_array, memsize). Is something wrong with my code, or are some .so files just missing? Here are my CUDA, Docker, and nvidia-smi versions:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Docker version 19.03.4
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 640      On   | 00000000:01:00.0 N/A |                  N/A |
| 40%   36C    P8    N/A /  N/A |     48MiB /  4035MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 760     On   | 00000000:02:00.0 N/A |                  N/A |
| 17%   36C    P8    N/A /  N/A |      1MiB /  4037MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

aleks
  • " CUDA Version: N/A " is the source of the problem. You need to fix the host driver installation – talonmies Oct 21 '19 at 14:51
  • @talonmies Is there a way to do it properly in Docker? Should I follow this [guide](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#package-manager-installation)? – aleks Oct 21 '19 at 14:56
  • @talonmies Adding `NVIDIA_DRIVER_CAPABILITIES compute,utility` has solved my issue! – aleks Oct 22 '19 at 09:20
  • Please add your solution as an answer – talonmies Oct 22 '19 at 10:33

1 Answer

Setting `NVIDIA_DRIVER_CAPABILITIES` to `compute,utility` with an `ENV` instruction in the Dockerfile solves the issue.
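
As a rough sketch of where that line could go in the Dockerfile from the question: `ENV` values set in an earlier build stage are not inherited by a later `FROM`, so the variables are placed in the final `python:3.7` stage here. The stage name `cuda` is a hypothetical label added for readability, and whether `NVIDIA_VISIBLE_DEVICES` and `LD_LIBRARY_PATH` are also needed in that stage is an assumption, not something confirmed above.

```dockerfile
# Sketch only, under the assumptions stated above.
# "cuda" is a hypothetical stage name; the question refers to this stage as 0.
FROM nvidia/cuda:9.1-runtime-ubuntu16.04 AS cuda

FROM python:3.7
COPY --from=cuda /usr/local/cuda-9.1 /usr/local/cuda-9.1

# ENV from the first stage does not carry over, so set everything here.
ENV NVIDIA_VISIBLE_DEVICES=all
# The fix: request the compute capability in addition to utility.
# Note there is no space after the comma.
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
# Assumed to be needed so the copied CUDA libraries can be found.
ENV LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64
```

The same variable can presumably also be supplied at run time without rebuilding, for example `docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility helm`.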

aleks
  • Thank you for answering your own question! Where did you find this information? I have been stuck on this (or a very similar) issue for two days and didn't find any mention of this environment variable – mallwright Mar 16 '23 at 13:20