NVIDIA is an American global technology company based in Santa Clara, California, best known for its graphics processing units (GPUs).
Questions tagged [nvidia]
70 questions
0
votes
1 answer
How can I find out if my Azure VM is running on DGX-1?
I am trying to reset the GPU of my Azure virtual machine (NVIDIA GPU Cloud Image running on Standard NV6 running Ubuntu 16.04.1) to get reproducible results on a deep learning algorithm. I found this NVIDIA help page, which explains that I cannot…

miguelmorin
- 249
- 1
- 5
- 13
0
votes
1 answer
Install Nvidia Drivers 9.0 for TensorFlow pip (Debian 9.7)
I installed Nvidia drivers 9.1 on my Debian 9.7 (Dataproc).
When I try to run TensorFlow 1.9 via this test script, it fails:
I used this guide to install the GPU drivers: https://cloud.google.com/dataproc/docs/concepts/compute/gpus
Used pip install…

gogasca
- 343
- 2
- 15
0
votes
0 answers
Checking GPU firmware
In a GPU cloud setup (with OpenStack) where the VMs can access the graphics cards via PCI passthrough, we want to be sure that no malicious user has changed the firmware of the GPU from inside a VM.
A potential solution we came up with was to use…

J. Chorin
- 41
- 3
0
votes
2 answers
"Too many levels of symbolic links" in NFS via automount resolved by restarting Docker
This is bizarre and while I have a workaround, I'd prefer a permanent fix.
I have a small group of GPU machines running Ubuntu 14.04 which I am using as workers for a cloud service that's effected via Docker images. I have nvidia-docker installed on…

krivard
- 192
- 2
- 9
0
votes
1 answer
yum install kmod-nvidia - kernel issue
I cannot install the NVIDIA driver on CentOS Linux release 7.3.1611 (Core); the kmod-nvidia package fails with errors and kernel incompatibilities.
It is usually installed with yum install kmod-nvidia -y
Current output:
sudo yum install…

Kevin Lemaire
- 135
- 2
- 10
0
votes
1 answer
Reverting yum update
I needed to update the NVIDIA driver on a CentOS 6.9 machine and decided to update a bit more. So I ran sudo yum update and rebooted. Unfortunately, that caused problems with NVIDIA that were worse than before. I am able to log in only remotely now, and…

Michael
- 1,723
- 2
- 11
- 7
0
votes
0 answers
Can't kill a process on GPU
I have a process running on a K80 GPU. Is there a way to stop it with the NVIDIA tools? I tried kill -9 and so on, but nothing kills it.
$ uname -a
Linux slurm10 3.16.0-33-generic #44~14.04.1-Ubuntu SMP Fri Mar 13 10:33:29 UTC 2015 x86_64…

PlagTag
- 253
- 1
- 3
- 9
0
votes
1 answer
CentOS 7 w/Gnome hangs on boot after Nvidia driver installation?
There is a lot of information available on these topics separately, but I haven't been able to find an answer to what I feel is a really common situation.
I have 2 Nvidia GTX 1080s in a server with CentOS 7 and Gnome desktop. The GPUs are going to…

Locane
- 429
- 1
- 8
- 20
0
votes
1 answer
kmod-nvidia-340xx.x86_64 for CentOS 7 kernel 3.10.0-229.20.1.el7.x86_64
I just installed CentOS 7, upgraded, etc.
>sudo yum update
No Packages to Update
>sudo yum upgrade
No Packages to Upgrade
I followed the instructions to install elrepo.
When I try:
>sudo yum install kmod-nvidia-340xx.x86_64
I get:
Requires: kernel…

hba
- 103
- 2
0
votes
0 answers
Will compute nodes with A100 80GB (2x on node1) and A100 40GB (2x on node2) work in Red Hat OpenShift cluster?
I think the answer should be yes. However, these cards are expensive, so I would like to hear from experts who have done this kind of thing.
Will MIG be supported on this?

techele
- 1
0
votes
0 answers
Interpretation of output of nvidia-smi and lspci | grep -i nvidia
I am very new to GPU servers. I submitted a Slurm job and then checked "nvidia-smi". I got the following output.
This picture
Then, I ran "lspci | grep -i nvidia" where I got this output.
01:00.0 VGA compatible controller: NVIDIA Corporation…
0
votes
1 answer
Problems installing Nvidia drivers for CUDA on Rocky Linux 9 - modprobe: ERROR: could not insert 'nvidia': Key was rejected by service
I've just installed the Nvidia drivers using the instructions here on our Threadripper workstations, https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#rhel-9-rocky-9
However, I'm getting this error after a reboot
modprobe: ERROR:…

James
- 101
- 14
0
votes
0 answers
MTU resetting back to 1500 after `netplan apply`
I'm trying to change the MTU of the eth0 interface on my machine running Ubuntu 18.04 (Nvidia Jetson Xavier NX). Running sudo netplan apply successfully sets the MTU for eth0 to 1280.
Unfortunately, within a minute, my SSH connection is dropped. I…

Ben Butterworth
- 562
- 5
- 12
0
votes
0 answers
NVIDIA GPU driver not functioning properly - Code 12
I installed Windows 10 as an additional OS using Boot Camp on my MacBook Pro, and later connected an eGPU to the MacBook Pro's Thunderbolt port. I can see the display adapter for my NVIDIA GPU, and the drivers are also installed in Windows,…

Rajathithan Rajasekar
- 111
- 2
0
votes
1 answer
Do nvidia drivers need reinstalling after changing memory slots on Linux (UEFI)?
I've changed the memory modules (took out a set of 3x32GB and put in a fresh set of 3x32GB ECC modules) on a Linux computer with Ubuntu 22.04, and upon reboot the nvidia-smi command complained that the drivers weren't up to date or properly installed or…

719016
- 103
- 3