
Conda does a nice job of tracking the dependencies of a package, but apparently most packages do not list the C library as a traceable dependency. For example, let's install Gnuastro with this command:

conda install -c conda-forge gnuastro

Then, I look into the libraries that one of Gnuastro's programs links with (for example astnoisechisel):

$ ldd $(which astnoisechisel)
    linux-vdso.so.1 (0x00007ffdbd336000)
    libgnuastro.so.9 => /path/to/conda/install/envs/testenv/bin/../lib/libgnuastro.so.9 (0x00007fe039ce1000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe039b86000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe0399c6000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe0399a5000)
    libgit2.so.28 => /path/to/conda/install/envs/testenv/bin/../lib/./libgit2.so.28 (0x00007fe039882000)
    libtiff.so.5 => /path/to/conda/install/envs/testenv/bin/../lib/./libtiff.so.5 (0x00007fe039800000)
    liblzma.so.5 => /path/to/conda/install/envs/testenv/bin/../lib/./liblzma.so.5 (0x00007fe0397d7000)
    libjpeg.so.9 => /path/to/conda/install/envs/testenv/bin/../lib/./libjpeg.so.9 (0x00007fe039799000)
    libwcs.so.5 => /path/to/conda/install/envs/testenv/bin/../lib/./libwcs.so.5 (0x00007fe03963e000)
    libcfitsio.so.8 => /path/to/conda/install/envs/testenv/bin/../lib/./libcfitsio.so.8 (0x00007fe039311000)
    libcurl.so.4 => /path/to/conda/install/envs/testenv/bin/../lib/./libcurl.so.4 (0x00007fe03928b000)
    libz.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/./libz.so.1 (0x00007fe03926f000)
    libgsl.so.23 => /path/to/conda/install/envs/testenv/bin/../lib/./libgsl.so.23 (0x00007fe038fc6000)
    libcblas.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/./libcblas.so.3 (0x00007fe03740d000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe03a049000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fe037402000)
    libssl.so.1.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libssl.so.1.1 (0x00007fe037372000)
    libcrypto.so.1.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libcrypto.so.1.1 (0x00007fe0370c4000)
    libssh2.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libssh2.so.1 (0x00007fe03708f000)
    libzstd.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libzstd.so.1 (0x00007fe036fd3000)
    libbz2.so.1.0 => /path/to/conda/install/envs/testenv/bin/../lib/././libbz2.so.1.0 (0x00007fe036fbf000)
    libgssapi_krb5.so.2 => /path/to/conda/install/envs/testenv/bin/../lib/././libgssapi_krb5.so.2 (0x00007fe036f70000)
    libkrb5.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/././libkrb5.so.3 (0x00007fe036e99000)
    libk5crypto.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/././libk5crypto.so.3 (0x00007fe036e78000)
    libcom_err.so.3 => /path/to/conda/install/envs/testenv/bin/../lib/././libcom_err.so.3 (0x00007fe036e72000)
    libgfortran.so.4 => /path/to/conda/install/envs/testenv/bin/../lib/././libgfortran.so.4 (0x00007fe036d44000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe036d3f000)
    libkrb5support.so.0 => /path/to/conda/install/envs/testenv/bin/../lib/./././libkrb5support.so.0 (0x00007fe036d31000)
    libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fe036d17000)
    libquadmath.so.0 => /path/to/conda/install/envs/testenv/bin/../lib/././libquadmath.so.0 (0x00007fe036cdd000)
    libgcc_s.so.1 => /path/to/conda/install/envs/testenv/bin/../lib/././libgcc_s.so.1 (0x00007fe036cc9000)

All the higher-level libraries used are within the Conda environment, except for the C library: libm.so.6, libc.so.6, libpthread.so.0, librt.so.1, libdl.so.2, libresolv.so.2 and of course ld-linux-x86-64.so.2.
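To pick out such libraries without reading the whole list, the ldd output can be filtered by path. The following is only a minimal sketch and not part of Conda or Gnuastro: outside_prefix is a hypothetical helper name, and it assumes the usual ldd line layout shown above.

```shell
# Read `ldd` output on stdin and print the libraries that resolve to a path
# outside the given prefix. `outside_prefix` is a hypothetical helper name.
outside_prefix() {
    prefix=$1
    awk -v prefix="$prefix" '
        # Lines of the form: "name => /resolved/path (address)"
        $2 == "=>" && $3 ~ /^\// && index($3, prefix) != 1 { print $1 " " $3 }
        # Lines that hold only an absolute path, such as the dynamic loader
        $1 ~ /^\// && index($1, prefix) != 1 { print $1 }
    '
}
```

With an activated environment, this could be called as `ldd "$(which astnoisechisel)" | outside_prefix "$CONDA_PREFIX"`, assuming CONDA_PREFIX (which Conda sets on activation) points at the environment directory.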

I couldn't find the GNU C Library on conda-forge, but when I searched I found some other projects that have it. So, for example, I tried:

conda install -c neok.m4700 glibc

This installed GNU C Library 2.30 (Conda tarball created 3 months ago) and the ldd command above gave me a beautiful list with everything in the Conda installation. In one test Conda environment, the call to astnoisechisel --version gave a segmentation fault and in another it succeeded. Then I tried another Conda C library (in a clean environment):

conda install -c asmeurer glibc

This one is an older version of the C library (last updated 5 years ago: glibc 2.19). In this environment, my astnoisechisel --version command would only give a segmentation fault and crash.

In this conda-forge discussion, it is said that "glibc is something that is not good to ship, and if we can't use the system glibc, I'm worried about this package. It is strongly tied to kernel versions, and it's also a security risk to have old versions in use". So I guess it's not their policy to include the GNU C Library (at least on GNU/Linux systems).

I also see a similar issue with "base" packages in Anaconda, for example when I check the linking flags of curl or zstd.

So my question is this: if the C library is not officially defined as a dependency (like all the other dependencies), how reliable are Conda packages (especially for older versions of software) in the not-too-distant future (like 5 years in the case above)?

On a similar note: Assume I need to manually fetch the proper C library that was used to build a Conda package (to be able to run the executable). Is the version of the C library used in the build of that package documented anywhere within the downloaded tarball?
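One partial check that does not rely on package metadata: glibc versions its symbols, so the highest GLIBC_x.y version that a binary references is the oldest glibc it can run on. That is not necessarily the version used at build time, and it does not say whether the tarball documents it. A minimal sketch, where max_glibc_symbol is a hypothetical helper name and GNU sort (for -V) is assumed to be available:

```shell
# Read `objdump -T` output on stdin and print the highest GLIBC_x.y symbol
# version mentioned in it. `max_glibc_symbol` is a hypothetical helper name.
max_glibc_symbol() {
    # Keep only the version strings, strip the prefix, then version-sort them.
    grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -u -V | tail -n 1
}
```

Assuming objdump from binutils is installed, it could be used as `objdump -T "$(which astnoisechisel)" | max_glibc_symbol`. The shared libraries inside the environment can be checked the same way.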

makhlaghi
  • 5
    My understanding of this is that Conda packages should be built against the oldest reasonably available C library. I believe that the `defaults` channel uses a version of CentOS with a somewhat old `glibc` for their Linux packages. The advantage of this approach is that, in general, `glibc` is backwards compatible, so compiling against the older version will work on an OS with a newer `glibc` but not the other way around. How this affects reproducibility is unanswered, AFAIK, but I don't think anyone has a good answer for this ¯\_(ツ)_/¯ – darthbith Dec 20 '19 at 04:04
  • 1
    While libc is not shipped as a dependency, the package is still linked to it and it will find the system libc, instead of being tied to a shipped libc in a conda environment. This is confirmed by the ldd output you have shown. @darthbith's comment is the proper answer to this question. – darcamo Oct 21 '20 at 20:52
  • 1
    I should also note that this is not unique to Conda packages; Python wheels suffer from the same problem. In fact, any binary dependency that depends on `libc` will have the same problem. The buck has to stop somewhere... This is one of the big advantages of source distributions: the user is able to compile against their local `libc`, so they know it will work on their computer. Of course, they have to have the compiler toolchain installed... – darthbith Oct 22 '20 at 01:26
  • 1
    The packages built in the conda-forge ecosystem target a specific kernel version and glibc version using a sysroot on Linux. As to how old versions of packages will continue to work on modern systems 5 years down the line, give this a read: https://developers.redhat.com/blog/2019/08/01/how-the-gnu-c-library-handles-backward-compatibility/ – Nehal J Wani May 12 '21 at 00:03
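The backward-compatibility rule described in these comments (built against an older glibc, runs on a newer one, but not the other way around) can be sketched as a simple version comparison. This is a simplified illustration and not a guarantee that a binary will run: glibc_at_least is a hypothetical helper name, and GNU sort (for -V) is assumed.

```shell
# Succeed (exit status 0) when the system glibc version is at least the
# version a package requires. `glibc_at_least` is a hypothetical helper.
glibc_at_least() {
    required=$1
    system=$2
    # Version-sort the pair; the smaller one comes first and must be "required".
    [ "$(printf '%s\n%s\n' "$required" "$system" | sort -V | head -n 1)" = "$required" ]
}
```

On a glibc-based system the second argument could come from `getconf GNU_LIBC_VERSION | awk '{print $2}'`, for example `glibc_at_least 2.17 "$(getconf GNU_LIBC_VERSION | awk '{print $2}')"`, where 2.17 stands in for whatever version the package actually needs.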
