Highest Voted 'nvidia' Questions - Server Fault Stack Exchange

14

votes

1 answer

What are actual Tesla M60 models used by AWS?

Wikipedia says that the Tesla M60 has 2x8 GB of RAM (whatever that means) and a TDP of 225–300 W. I use an EC2 instance (g3s.xlarge) that is supposed to have a Tesla M60, but the nvidia-smi command reports 8 GB of RAM and a maximum power limit of 150 W: > sudo…

amazon-web-services graphics-processing-unit nvidia

asked Mar 12 '19 at 00:26

hans

242
2
8

7

votes

1 answer

Google Kubernetes Engine node pool does not autoscale from 0 nodes

I am trying to run a machine learning job on GKE, and need to use a GPU. I created a node pool with Tesla K80, as described in this walkthrough. I set the minimum node size to 0, and hoped that the autoscaler would automatically determine how many…

kubernetes google-kubernetes-engine graphics-processing-unit nvidia

asked Apr 09 '19 at 16:23

anna_hope

173
1
5

5

votes

1 answer

Why is my CUDA GPU-Util ~70% when there are "No running processes found"?

After configuring a system with 2 Tesla K80 cards, I noticed when running nvidia-smi that one of the 4 GPUs was under heavy load despite there being "No running processes found". Why is this happening and how do I correct this? Here is the output…

cuda nvidia

asked Sep 26 '16 at 18:56

Steven C. Howell

671
6
9

4

votes

0 answers

Erase GPU memory

We have Nvidia GPU cards that can be used by different users in an OpenStack environment. A first user creates a VM with access to a GPU card, then deletes the VM when done. Another user then creates a VM which is given access to the same card.…

security virtualization openstack graphics-processing-unit nvidia

asked Aug 08 '18 at 15:07

J. Chorin

41
3

4

votes

2 answers

8 GPU machine freezes

We have a SuperMicro GPU server with: 2x Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz; 512GB memory; more than enough disk space; X10DRG-O+-CPU (BIOS Version : 2.0a [current]); X9DRG-O-PCIE PCI-E expander card; 8x GTX 1080. It is set up with Ubuntu 16.04.1…

ubuntu supermicro cuda nvidia

asked Feb 08 '17 at 11:51

pks

41
3

3

votes

2 answers

NVIDIA-SMI can't communicate with NVIDIA driver

Problem description: I am trying to set up a CentOS 7 GPU (Nvidia Tesla K80) instance on Google Cloud to execute CUDA work. Unfortunately, I can't seem to properly install/configure the drivers. Here is what happens when trying to interact with…

centos centos7 google-cloud-platform google-compute-engine nvidia

asked Dec 04 '18 at 15:36

Elouan Keryell-Even

493
2
8
21

3

votes

0 answers

The GPU usage provided by nvidia-smi command is very different from GPU metrics from guest OS

I'm working on a project to monitor virtual machines' vGPU usage. The hypervisor is vCenter; we have NVIDIA A16 cards installed on the vCenter hosts and have assigned A16 vGPUs to a couple of Windows VMs on this host. These vGPUs are allocated to the…

virtualization vmware-vcenter nvidia task-manager gpu

asked Aug 31 '23 at 16:12

Zhuoran Bao

31
2

3

votes

1 answer

Dell PowerEdge R7525 + Nvidia A16

We have a PowerEdge R7525 server with an NVIDIA A16 graphics card running Debian 11, but we get about 50% lower GPU performance than on other servers. I suspect the cause is the missing "Above 4G decoding" option in the BIOS. According to NVIDIA, this server should…

debian dell-poweredge dell nvidia

asked Aug 28 '23 at 10:22

Aotor

31
1

2

votes

0 answers

"Getting devices ready" on Windows 10 while booting VM/iSCSI on another machine than initially set up

TL;DR version: a virtual Windows instance reinstalls GPU drivers when switching to other hosts, even though it gets the same hardware every time. I'm trying to avoid this or shorten the time it takes. Full version: I've got an iSCSI server (Windows…

hyper-v windows-10 iscsi nvidia

asked Nov 22 '19 at 13:42

Domel

21
4

2

votes

0 answers

nvidia-smi must be run by root before it can be used by regular users

On a newly built Ubuntu 16.04 machine, running nvidia-smi fails as a regular user: $ nvidia-smi NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running. Running…

ubuntu ubuntu-16.04 nvidia

asked Jul 19 '19 at 02:16

hanxue

1,377
2
11
12

2

votes

1 answer

Access Denied on NVIDIA GRID 7.2 Driver

I am trying to set up an NVIDIA Tesla T4 GPU and use its RTX functionality in a raytracing application (Bakery for Unity3D). But every time I launch the app, Bakery tells me it could not find the OptiX library. I believe I have tracked it down to…

google-cloud-platform google-compute-engine nvidia grid

asked Apr 02 '19 at 12:01

omacha

63
3

2

votes

1 answer

Failed to initialize NVML: Unknown Error - Not able to complete NVIDIA Tesla P100 Grid Setup on the vSphere Host Server with Vmware ESXI 6.7

I am unable to set up NVIDIA Tesla P100 GRID on a vSphere host server running VMware ESXi 6.7 on a Dell EMC PowerEdge R740. When I try to run the nvidia-smi command, I get the following error: Failed to initialize NVML: Unknown…

vmware-esxi vmware-vsphere vmware-esx nvidia

asked Mar 08 '19 at 10:08

Sarath Zacharia

31
1
5

2

votes

0 answers

Specify a GPU to use at launch

I am currently working with an Azure GPU VM (NV6, using an NVIDIA M60 graphics card). I'm running my benchmark on this VM without any issues so far. Now I'm running the same benchmark on an NV12, which has 2 GPUs (or at least Windows Server sees it as 2…

azure nvidia grid

asked Feb 05 '19 at 14:41

Turgal

121
1

2

votes

4 answers

Nvidia driver breaks vncserver on CentOS 7.4, is there a work around?

CentOS Linux release 7.4.1708 (Core); uname -r output: 3.10.0-693.2.2.el7.x86_64; NVidia driver: NVIDIA-Linux-x86_64-375.66.run. When using the Nvidia graphics card driver with the Nvidia GeForce GT 720 graphics card on CentOS 7.4, it works fine for…

centos7 vnc nvidia

asked Oct 15 '17 at 09:23

Edward_178118

955
4
15
33

2

votes

1 answer

Installing NVIDIA Drivers for Diskless Environment

I'm trying to set up a cluster of 8 computers plus a main file server. Ideally, I'd like to set this up in a PXE-boot, quasi-diskless/quasi-stateless environment (i.e. the only local storage is /var, where things like Torque configuration will go).…

centos7 nvidia

asked Jan 15 '17 at 23:20

Travis DePrato

70
1
5
