2

I am trying to run an OpenGL+GLUT program over SSH with X forwarding. The program prints the following error, then segfaults.

Xlib: extension "NV-GLX" missing on display "localhost:10.0".

It seems this happens because my "server" computer has an NVidia card and is telling my client computer to use NVidia-specific rendering functions, which the client, having no NVidia card, does not support. I googled this, of course, and saw that many other people have had similar problems; however, the only solution I saw suggested (https://superuser.com/questions/196838/opengl-program-not-work-with-x-forwarding) was to try

$ export LIBGL_ALWAYS_INDIRECT=1   # or use any nonzero value

which did not work. I don't care about hardware acceleration or maintaining great performance over the SSH connection; I would just like to get the window rendering.

Jomnipotent17

1 Answer

4

First things first: with X11, the server is the computer that produces the display output. The client is the program running on the remote computer, which makes use of the server's display services.

You are right insofar as you get this message because your client (running on the remote computer) is executed on a machine with an NVidia GPU. However, it's not the GPU that's causing the trouble, but its drivers. One of the major drawbacks of the Linux OpenGL ABI (application binary interface) is that the driver is also responsible for providing the system's libGL.so. If you think about it, this is a rather ill-conceived specification, since it actively prevents the installation of drivers for multiple GPUs from different vendors. (Windows never had this problem because of its ICD OpenGL driver model.)

Anyway, when your NVidia driver's libGL.so connects to a remote X server that does not run an NVidia driver, it sees that certain server extensions (such as NV-GLX) are not available and hence refuses to work.

So what can you do about this?

Well, you can install Mesa3D alongside the NVidia drivers. Most Linux distributions have mechanisms (Gentoo eselect, Debian alternatives) that allow multiple variants of an API provider to be installed, with one selected as the default.

With Mesa3D installed, you can use the LD_PRELOAD environment variable to preload the Mesa3D libGL.so instead of the system default libGL.so. It will be located somewhere like /usr/lib64/opengl/xorg-x11/lib/libGL.so; use your Linux distribution's package manager tools to find where it is, or run find /usr -iname 'libGL.so*' and choose the one whose directory does not contain nvidia.
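The find-and-filter step can be sketched as a small shell helper. This is only an illustration: pick_mesa_libgl is a made-up name, ./my_glut_program is a placeholder for your own binary, and the sketch assumes that "does not contain nvidia" is enough to identify the Mesa3D library on your system.

```shell
#!/bin/sh
# Hypothetical helper: reads candidate libGL paths (one per line) on stdin
# and prints the first one whose path does not mention "nvidia".
pick_mesa_libgl() {
    # -i: ignore case, -v: keep only lines that do NOT match
    grep -iv 'nvidia' | head -n 1
}

# Typical use (left commented out because the program name is a placeholder):
#   mesa_libgl=$(find /usr -iname 'libGL.so*' 2>/dev/null | pick_mesa_libgl)
#   LD_PRELOAD="$mesa_libgl" ./my_glut_program
```

If the helper prints nothing, no non-NVidia libGL.so was found and Mesa3D is probably not installed yet.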

Another viable method would be to use lxc containers: create a secondary system installation with Mesa3D as the default OpenGL provider, and arrange for SSH logins to be dropped into that container. (Given the right configuration, it's perfectly possible to make the container a mere overlay over the base system, from which breaking out into the bare system is still possible.)

The Mesa3D libGL.so will happily work over a remote X session. However, keep in mind that full indirect operation has been specified only up to OpenGL-2.1 (i.e. for many functions of OpenGL-3 and later, no GLX opcodes have been defined). Many extensions that also made it into OpenGL-3 core do define GLX opcodes, however, so if you depend on indirect OpenGL you may have to fall back to those.

Update:

Also be careful when using extensions and modern OpenGL functionality. Any function that must be loaded at runtime using glXGetProcAddress may not be available at all. The segfault you're getting suggests that you may be calling a function pointer (loaded through GLEW or similar) that's simply not available, so you're dereferencing an invalid pointer, which leads to the crash. Always check that all functions and extensions you call are actually present!
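This does not replace the check inside the program itself, but as a quick pre-flight test you can inspect what the forwarded display offers from the shell. The sketch below assumes the glxinfo utility is installed on the remote machine; the two helper names are made up for illustration, and both simply scan glxinfo output passed on stdin.

```shell
#!/bin/sh
# Hypothetical helpers; both read `glxinfo` output on stdin.

# Succeeds if glxinfo reports "direct rendering: Yes".
is_direct_rendering() {
    grep -qi '^direct rendering: yes'
}

# Succeeds if the extension named in $1 is listed as a whole word.
has_gl_extension() {
    # -w: whole-word match, -F: treat the name as a fixed string
    grep -qwF -- "$1"
}

# Typical use inside the SSH session (commented out; needs a working display):
#   info=$(glxinfo)
#   printf '%s\n' "$info" | is_direct_rendering || echo "rendering is indirect"
#   printf '%s\n' "$info" | has_gl_extension GL_ARB_vertex_buffer_object \
#       || echo "GL_ARB_vertex_buffer_object is not available"
```

If an extension your program relies on is missing here, the corresponding function pointers are likely to be unusable in the program as well.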

datenwolf
  • "Windows never had this problem because of its ICD OpenGL driver model" Then again, Windows' OpenGL DLL only goes up to version 1.1, which is archaic, and you need to use something like GLEW to get any other functions. In Linux, the driver can provide a library with whatever version they want, though I'm not sure if they do in practice. – Colonel Thirty Two Apr 24 '14 at 14:20
  • @ColonelThirtyTwo: You're misinformed. In Linux the `libGL.so` is only defined for OpenGL-1.2 and you **must not** fetch pointers to higher-version symbols directly from the `.so` export table. Just like on Windows, all extended entry points must be queried through a GetProcAddress function, namely `glXGetProcAddress`. See the OpenGL Linux ABI specification, sections 3.5 and 3.6: http://www.opengl.org/registry/ABI/ – datenwolf Apr 24 '14 at 14:47
  • Huh, didn't know that. That seems like an odd restriction. – Colonel Thirty Two Apr 24 '14 at 14:57
  • @ColonelThirtyTwo: It's not an odd restriction, it's a sensible restriction. It eliminates weird problems and error messages upon program startup or linkage. Say you'd link against a symbol `glShaderSource` that some vendor's `libGL.so` exported. Now you distribute your program and some of the target systems use a different `libGL.so` that doesn't implement those functions/extensions: a program distributed in binary form would not execute, yielding an "unresolvable symbol" error at startup. A program distributed in source form would not build, because the linker cannot resolve the symbol. – datenwolf Apr 24 '14 at 15:28
  • @ColonelThirtyTwo: **OpenGL is not a library!** It's an API you use to talk to hardware. Target systems' hardware configuration is not fixed and the supported OpenGL version or profile actually depends on the display used for output. Versioned libraries (sonames) just add to the problem. For something like OpenGL you want to have a strict ABI, and for everything beyond that it is insane to expose it through a static symbol table. – datenwolf Apr 24 '14 at 15:31
  • Yea, fair enough. But why expose any functions at all through the static symbol table then, especially ones that have been removed from 3.2+? I guess it's probably for backwards compatibility. – Colonel Thirty Two Apr 24 '14 at 15:57
  • @ColonelThirtyTwo: A `libGL.so` that exposes functions that are not part of the Linux OpenGL ABI does this because the vendor of that API interface library simply didn't care to remove them from the export table; by default a `.so` exposes all non-`static` symbols. You have to use additional linker attributes to keep such symbols private. Here's an article on it: http://www.ibm.com/developerworks/aix/library/au-aix-symbol-visibility/index.html?ca=dat – datenwolf Apr 24 '14 at 16:30