I've been building a Singularity container to run some Python code, and despite reading the Singularity docs, I can't make sense of the errors and behavior I'm seeing.
First, the container is Ubuntu 18.04 bootstrapped from Docker, i.e.:
Bootstrap: docker
From: ubuntu:18.04
I need to use a Python module (neuron) that has to be compiled beforehand. I compile it in the %post section of the definition file and add these environment variables:
echo 'export PATH=$PATH:/usr/local/nrn/x86_64/bin' >>$SINGULARITY_ENVIRONMENT
echo 'export LD_LIBRARY_PATH=/usr/local/nrn/x86_64/lib:$LD_LIBRARY_PATH' >>$SINGULARITY_ENVIRONMENT
I can build the container without too many issues (using sudo singularity build --sandbox). I've then been trying to run a test script (test.py) to make sure everything works as expected. In the script I import the module in question (neuron) and then try to save a list to a CSV file to check that I can save data properly. It looks something like this:
import neuron #this fails and gives an unusual error in specific circumstances I don't understand (described below)
import numpy as np
some_data = [1,2,3]
np.savetxt('test_results.csv',np.asarray(some_data),delimiter=',')
Depending on the flags I pass to singularity exec, I get different results that I don't understand, and I don't know where to start: is this a neuron, Singularity or Ubuntu issue?
For completeness, the container and test.py are in the directory I'm running these commands from (dir in my example). If I leave $HOME mounted, i.e. don't use the --no-home flag, and run test.py like this:
singularity exec --writable --bind /home/bidby/path/to/some/dir:/mnt my_container.simg python3 /mnt/test.py
I get an error like this:
dlopen failed - x86_64/.libs/libnrnmech.so: undefined symbol: celsius
I've googled this a fair bit, and it might be a C++ linking error (but I only really know Python, so debugging it hasn't been easy).
However, if I use the --no-home flag, i.e.:
singularity exec --no-home --writable --bind /home/bidby/path/to/some/dir:/mnt my_container.simg python3 /mnt/test.py
then the module imports successfully and a new error arises:
Traceback (most recent call last):
  File "/mnt/test.py", line 15, in <module>
    np.savetxt('test_results.csv',np.asarray(some_data),delimiter=',')
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 1352, in savetxt
    open(fname, 'wt').close()
PermissionError: [Errno 13] Permission denied: 'test_results.csv'
I've been googling this for several days now, but I can't figure out what the problem is. From what I've learnt and tested, I suspect it has something to do with how environment variables are passed into the container, although why I don't have permission to save the file is beyond me. I feel this might be resolved if I can understand why the --no-home flag affects the module import.
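To narrow this down, a minimal diagnostic sketch like the one below could be run in place of test.py with each set of flags. It uses only the standard library and reports the working directory, whether that directory is writable, and the environment variables involved. The function name describe_runtime is made up for illustration, and it is an assumption on my part that these are the values that differ between the invocations.

```python
import os


def describe_runtime():
    """Collect values that may differ between singularity exec invocations."""
    cwd = os.getcwd()
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    return {
        # A relative filename such as 'test_results.csv' is created here.
        "cwd": cwd,
        # False here would be consistent with the PermissionError above.
        "cwd_writable": os.access(cwd, os.W_OK),
        "home": os.environ.get("HOME"),
        "path": os.environ.get("PATH", "").split(os.pathsep),
        # Drop empty entries left by a leading or trailing separator.
        "ld_library_path": [p for p in ld_path.split(os.pathsep) if p],
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

Comparing the output with and without --no-home should show whether the working directory or the environment is what changes.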
These may not help solve the problem, but here are other things I've noticed or tried:
If I use the --containall flag, I can run test.py with no problem, but then the CSV file I try to save can never be found. I checked the docs, which say:
Using the --containall (or -C for short) flag, $HOME is not mounted and a dummy bind mount is created at the $HOME point. You cannot use -B (or --bind) to bind your $HOME directory because it creates an empty mount. So if you have files located in the image at /home/user, the --containall flag will hide them all.
and I presume this "dummy bind mount" is where the file is being written to, hence why I can never actually find it.
If I shell into the container with sudo and the --writable flag, I can import neuron without any problem. If I don't use either of those flags, I get the same "undefined symbol" error as above.
If I don't export the LD_LIBRARY_PATH then I get a different dlopen error referring to a different .so file, saying that the file doesn't exist - this reaffirms my thinking that it's a path problem.
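Since the path in the first error (x86_64/.libs/libnrnmech.so) is relative, it may be resolved against the current working directory rather than found via LD_LIBRARY_PATH. That is only my assumption. A rough sketch to check both possibilities from inside the container is shown here; find_library is a hypothetical helper, not part of neuron or Singularity.

```python
import os


def find_library(filename, search_dirs):
    """Return the absolute path of every copy of filename in search_dirs."""
    hits = []
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            hits.append(os.path.abspath(candidate))
    return hits


if __name__ == "__main__":
    # Relative path copied from the dlopen error message.
    relative = os.path.join("x86_64", ".libs", "libnrnmech.so")
    print("cwd:", os.getcwd())
    print("relative path exists from cwd:", os.path.isfile(relative))

    # Directories the dynamic loader was told about via the environment.
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    ld_dirs = [d for d in ld_path.split(os.pathsep) if d]
    print("copies on LD_LIBRARY_PATH:", find_library("libnrnmech.so", ld_dirs))
```

If the relative path exists only when $HOME is mounted, that would suggest a stale library in the host directory is being picked up instead of the one compiled inside the container.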
I know I haven't included enough code to reproduce this error, since I'm guessing no one has the time/energy to build this container (since it's fairly large) but I think I've included the most relevant parts. Will be happy to add more if needed though.
Debugging this has been a nightmare for me, and if anyone can point me in the right direction of what I should be googling I would appreciate it a lot.