I am trying to set up a small Kubernetes cluster on my Ubuntu 18.04 LTS server. Every setup step is done, but checking the GPU status fails. The container keeps reporting errors:

1. Issue Description
I followed the steps in the Quick Start, but when I run the test case, it reports an error.

2. Steps to reproduce the issue

  • run the shell command

    docker run --security-opt=no-new-privileges --cap-drop=ALL --network=none -it -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins nvidia/k8s-device-plugin:1.9

  • check the errors

    2020/02/09 00:20:15 Starting to serve on /var/lib/kubelet/device-plugins/nvidia.sock
    2020/02/09 00:20:15 Could not register device plugin: rpc error: code = Unimplemented desc = unknown service deviceplugin.Registration
    2020/02/09 00:20:15 Could not contact Kubelet, retrying. Did you enable the device plugin feature gate?
    2020/02/09 00:20:15 You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
    2020/02/09 00:20:15 You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start
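
The line that matters in this output is the `Unimplemented` one: in gRPC, "unknown service" means the server (here the kubelet) has no service registered under the name the client called. A minimal sketch for spotting that line in saved logs; the helper name `check_plugin_log` is hypothetical and not part of any NVIDIA or Kubernetes tooling:

```shell
# Hypothetical helper: reads device plugin logs on stdin and reports
# whether the kubelet rejected the plugin's registration call.
check_plugin_log() {
  # -F matches the string literally, -q suppresses grep's own output
  if grep -qF 'unknown service deviceplugin.Registration'; then
    echo "registration rejected: kubelet does not serve deviceplugin.Registration"
    return 1
  fi
  echo "no registration error found"
}
```

It could be fed from `docker logs <container> 2>&1 | check_plugin_log`, where `<container>` is whatever name or ID the plugin container has on your machine.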

3. Environment Information
  • output of nvidia-docker run --rm dlws/cuda nvidia-smi

NVIDIA-SMI 440.48.02 Driver Version: 440.48.02 CUDA Version: 10.2

  • contents of /etc/docker/daemon.json

    {
        "default-runtime": "nvidia",
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": []
            }
        }
    }

  • docker version: 19.03.2
  • kubernetes version: 1.15.2
Wallace
  • You ask about Kubernetes but then run a docker command, which doesn't make sense. You also didn't provide the error you get from the test case. Finally, you need to explain in detail how you set up your cluster. – morgwai Feb 09 '20 at 10:28
  • @morgwai I re-edited the post and added the answer I found. – Wallace Feb 25 '20 at 03:04

1 Answer

I finally found the answer. I hope this post helps others who run into the same issue:

For Kubernetes 1.15, use k8s-device-plugin:1.11 instead. Version 1.9 is not able to communicate with the kubelet.
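
As a minimal sketch, this is the command from the question with only the image tag changed from 1.9 to 1.11. I am assuming the remaining flags can stay exactly as they were; the variable `PLUGIN_IMAGE` and the wrapper `run_device_plugin` are names made up for illustration:

```shell
# Image tag taken from the answer above; everything else is the
# command from the question, unchanged.
PLUGIN_IMAGE="nvidia/k8s-device-plugin:1.11"

# Hypothetical wrapper so the command can be defined once and re-run.
run_device_plugin() {
  docker run \
    --security-opt=no-new-privileges \
    --cap-drop=ALL \
    --network=none \
    -it \
    -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins \
    "$PLUGIN_IMAGE"
}
```

Calling `run_device_plugin` starts the container in the foreground, so the registration messages show up directly in the terminal.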

Wallace