
I am building some Docker Spark images and I am a little puzzled about how to pass environment (ENV) variables defined in the Dockerfile all the way down into the container via "run -e", on into supervisord, and then into the spark-submit shell, without having to hard-code them again in the supervisord.conf file (as seems to be the suggestion in something somewhat similar here: supervisord environment variables setting up application).

To help explain, imagine the following components:

  1. Dockerfile (contains about 20 environment variables, e.g. "ENV FOO1 bar1")

  2. run.sh (docker run -d -e my_spark_program)

  3. conf/supervisord.conf ([program:my_spark_program] command=sh /opt/spark/sbin/submit_my_spark_program.sh etc.)

  4. submit_my_spark_program.sh (contains a spark-submit of the jar I want to run - probably also needs something like --files --conf 'spark.executor.extraJavaOptions=-Dconfig.resource=app' --conf 'spark.driver.extraJavaOptions=-Dconfig.resource=app', but this doesn't quite seem right? See the sketch after this list.)
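
To make the chain concrete, here is a minimal sketch of how those four pieces might fit together (FOO1/bar1 is the example variable from the list above; the supervisord.conf path, class name and jar path are placeholders I made up). Programs started by supervisord normally inherit supervisord's own environment, so in principle nothing has to be redeclared in supervisord.conf:

```
# Dockerfile (excerpt) -- FOO1/bar1 is the example variable from above
ENV FOO1 bar1
CMD ["/usr/bin/supervisord", "-c", "/opt/spark/conf/supervisord.conf"]

# conf/supervisord.conf (excerpt)
# child programs normally inherit supervisord's environment, so FOO1 should
# already be visible to the submit script without being redeclared here
[program:my_spark_program]
command=sh /opt/spark/sbin/submit_my_spark_program.sh

# submit_my_spark_program.sh (excerpt)
#!/bin/sh
echo "FOO1 is: ${FOO1}"   # sanity check that the variable made it this far
spark-submit \
  --conf "spark.executor.extraJavaOptions=-Dconfig.resource=app" \
  --conf "spark.driver.extraJavaOptions=-Dconfig.resource=app" \
  --class com.example.MySparkProgram \
  /opt/spark/jars/my_spark_program.jar
```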

I guess I would like to define my ENV variables once, and only once, in the Dockerfile, and I think it should be possible to pass them into the container via run.sh using the "-e" switch, but I can't seem to figure out how to pass them from there into supervisord and onward into the spark-submit shell (submit_my_spark_program.sh) so that they are ultimately available to my spark-submitted jar file. This seems a little over-engineered, so maybe I am missing something here...?

Dave McLure
  • I imagine your `submit_my_spark_program.sh` file should be able to use the environment variables just like a normal shell script. Have you tested it? Was it not working? BTW if `ENV` is defined in the Dockerfile, you don't need to run it with `-e`, which just redefines the variable. – Xiongbing Jin Mar 16 '16 at 23:21
  • I tried scrubbing a smaller version of the app I have to try to duplicate the issue, and I can duplicate it, so whatever I am doing wrong, I am doing it the same wrong way in both cases. I will supply code samples. The thing I am also noticing is that when I enter the container using "docker exec -it spark-simple-app bash" I can see the environment variables I want to use in the environment (using "env"), so the Dockerfile is doing its job of getting environment variables into the container, but somewhere along the line these same variables still aren't able to reach supervisord and/or my app. – Dave McLure Mar 19 '16 at 20:32

1 Answer


Apparently the answer (or at least the workaround) in this case is not to use System.getProperty(name, default) to get the Docker ENV variables through supervisord, but instead to use the somewhat less convenient System.getenv(name) - as this does seem to work.

I was hoping to be able to use System.getProperty(name, default) to read the Docker ENV variables, since it offers an easy way to supply default values, but apparently that does not work in this case (environment variables are not Java system properties). If someone can improve on this answer by providing a way to use System.getProperty, then by all means join in. Thanks!
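
For illustration, here is a minimal sketch of the difference inside the application itself (assuming a Scala Spark job; the object name and default value are made up, and FOO1 is the example variable from the question). A default can still be supplied by hand around System.getenv:

```scala
// Minimal sketch (Scala) of the workaround described above.
object MySparkProgram {
  def main(args: Array[String]): Unit = {
    // JVM system properties: this is what did NOT see the Docker ENV values,
    // because environment variables are not system properties.
    val fromProps = System.getProperty("FOO1", "default-bar1")

    // The environment itself: this is what worked. A default can still be
    // supplied manually via Option/getOrElse (or sys.env.getOrElse in Scala).
    val fromEnv = Option(System.getenv("FOO1")).getOrElse("default-bar1")

    println(s"FOO1 from system property: $fromProps")
    println(s"FOO1 from environment:     $fromEnv")
  }
}
```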

Dave McLure