GeForce GTX 1060 card in Dell R710 or R730xd for machine learning?

Question

We are investigating ways to speed up some machine learning code written using Theano and Keras, in particular by adding a GPU card to one of our servers. Does anyone have direct experience with this or a very similar combination of card and server? Specifically, we are interested in people's experiences with the following:

Is it physically possible to install a card such as a GTX 1060 in a Dell R710 or R730xd?
Is anything special required to get CentOS Linux to recognize the card, other than installing the necessary Nvidia drivers?
Are there any issues with respect to power, cooling, etc., we should worry about?

A similar question has been asked, but for a different card and operating system. Discussions elsewhere such as here suggest it's possible for similar hardware, but a bit tricky. Before having our organization buy the hardware, it would be helpful to know whether there are serious issues.

Have you taken steps to answer these questions on your own yet? After all it is just a matter of looking up the specs for both pieces of hardware and driver support for CentOS (yes, Nvidia cards work on Linux). — Ryan Babchishin, Jul 30 '16 at 05:16
Yes, and in fact I've used nvidia cards before, for graphics, over a decade ago. The question has to do with both hardware and software issues. Perhaps I should rephrase the question to make it more clear I'm hoping to get people's actual practical experiences with a card of this kind in a Dell rack-mount server. — mhucka, Jul 30 '16 at 06:07
If you're using cuda, I'm pretty sure you'll need nvidia proprietary drivers and you need X running on the card. I've done it on Ubuntu... never tried CentOS. — Ryan Babchishin, Jul 30 '16 at 06:09
Thanks, that's somewhat encouraging. As it turns out, running X is not needed (it's a headless server), which should reduce hassles. — mhucka, Jul 30 '16 at 06:17
This says you do need X, but it will still be headless: https://sites.google.com/site/akohlmey/random-hacks/nvidia-gpu-coolness — Ryan Babchishin, Jul 30 '16 at 06:44
I see, now I understand your comment better. That is a very interesting find, and the kind of information I was hoping to learn. If you would like to repost it as an answer, I would be happy to upvote it. — mhucka, Jul 30 '16 at 07:31
To the people who downvoted this question, it would be helpful if you left a comment explaining why. — mhucka, Jul 30 '16 at 07:33

score 1 · Answer 1 · answered Jul 30 '16 at 08:09

You'll need the Nvidia proprietary drivers to use CUDA/OpenCL.
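Once the driver is installed, one way to confirm that it can see the card is to query nvidia-smi, the command-line utility that is normally installed alongside the proprietary driver. The following is only a minimal sketch: the helper names parse_gpu_list and list_gpus are made up for illustration, and I have not run it on an R710 or R730xd.

```python
import shutil
import subprocess


def parse_gpu_list(csv_text):
    """Parse 'name, driver_version' CSV lines into (name, driver) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        # Split on the last comma so a GPU name containing commas stays intact.
        name, sep, driver = line.strip().rpartition(",")
        if not sep:
            continue  # blank or unexpected line, ignore it
        gpus.append((name.strip(), driver.strip()))
    return gpus


def list_gpus():
    """Return the GPUs reported by nvidia-smi, or [] if it cannot be used."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver utilities not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version",
         "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return []  # utility present, but it could not talk to a GPU
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No NVIDIA GPU reported by nvidia-smi")
    for name, driver in gpus:
        print(f"{name} (driver {driver})")
```

An empty result means either that the utility is missing or that the driver could not reach a card, so it is worth checking before blaming Theano or Keras.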

The card will need to be configured with X, as the Nvidia drivers are X drivers. It can still be configured to be "headless", and you can have multiple graphics cards.

Some details on running GPUs in headless servers from: https://sites.google.com/site/akohlmey/random-hacks/nvidia-gpu-coolness

Faking a "Head" for a Headless X Server

The biggest remaining challenge is now to make the X server launch properly without having a display attached. Nowadays, display settings are negotiated between the X server and the display via EDID, and this is how we can simulate a display. The X server allows to override EDID settings and to define which display to configure through settings in the /etc/X11/xorg.conf file. All that is missing is a valid EDID file and this can be obtained from nvidia-settings through the "Acquire EDID" button, when examining the properties of a currently attached display (doesn't matter which one). In the xorg.conf file, something along the lines of the following has to be set.

Section "Screen"
    Identifier     "Screen0"
    Option         "UseDisplayDevice" "DFP-0"
    Option         "ConnectedMonitor" "DFP-0"
    Option         "CustomEDID" "DFP-0:/etc/X11/dfp-edid.bin"
    Option         "Coolbits" "5"
    ....
EndSection
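If you want to double-check an xorg.conf against the snippet above, a small script can list which of the display-faking options are missing. This is a rough sketch under some simplifying assumptions: the function names are hypothetical, the regular expression only handles two-argument Option lines spelled exactly as shown, and it does not track which Section an option belongs to.

```python
import re

# Matches lines such as:  Option "CustomEDID" "DFP-0:/etc/X11/dfp-edid.bin"
# Commented-out lines (starting with '#') are not matched.
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"\s+"([^"]*)"', re.MULTILINE)


def screen_options(xorg_conf_text):
    """Return {option name: value} for every two-argument Option line."""
    return dict(OPTION_RE.findall(xorg_conf_text))


def missing_headless_options(xorg_conf_text):
    """List the display-faking options from the snippet that are not set."""
    required = ("UseDisplayDevice", "ConnectedMonitor", "CustomEDID")
    found = screen_options(xorg_conf_text)
    return [name for name in required if name not in found]
```

For example, missing_headless_options(open("/etc/X11/xorg.conf").read()) should return an empty list for a file that contains the three options quoted above.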

I found the drivers prepackaged in ELRepo:

https://elrepo.org/tiki/tiki-index.php

They can also be downloaded from Nvidia's site, but then they won't be updated automatically.

I can't say how the server will respond to having an additional GPU in it, but you may need to adjust the BIOS settings. According to the site mentioned above about configuring it as headless, you may need to boot the server with the card configured as the primary graphics adapter, or at least plug in a monitor temporarily, so that you can set it up with the Nvidia utilities (to generate dfp-edid.bin).
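Since the X server depends on the saved EDID file being valid, it may be worth sanity-checking dfp-edid.bin after generating it. The sketch below relies on two properties of the EDID format: the base block starts with a fixed 8-byte header, and every 128-byte block sums to 0 modulo 256. The function name edid_problems is made up for illustration, and passing these checks does not guarantee that the driver will accept the file.

```python
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
BLOCK_SIZE = 128


def edid_problems(data):
    """Return a list of problems found in raw EDID bytes (empty if none)."""
    problems = []
    if len(data) == 0 or len(data) % BLOCK_SIZE != 0:
        problems.append(
            f"length {len(data)} is not a positive multiple of {BLOCK_SIZE}"
        )
        return problems
    if data[:8] != EDID_HEADER:
        problems.append("missing the fixed 8-byte EDID header")
    for index in range(len(data) // BLOCK_SIZE):
        block = data[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
        # Each 128-byte block must sum to 0 modulo 256.
        if sum(block) % 256 != 0:
            problems.append(f"block {index} fails the checksum")
    return problems
```

To check the file from the xorg.conf snippet, read it in binary mode, for example edid_problems(open("/etc/X11/dfp-edid.bin", "rb").read()), and look for an empty list.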
