
I am using GCSFuse to mount a GCS bucket into my user pod in JupyterHub, but it always fails with the error message "gcsfuse takes exactly two arguments".

Here is my Dockerfile:

FROM jupyter/minimal-notebook:177037d09156

ENV GCSFUSE_REPO gcsfuse-stretch
ENV GOOGLE_APPLICATIONS_CREDENTIALS=test-serviceaccount.json
ENV GCS_BUCKET: "my-bucket"
ENV GCS_BUCKET_FOLDER: "shared-data"

USER root

# Add google repositories for gcsfuse and google cloud sdk
RUN apt-get update -y && apt-get install -y --no-install-recommends apt-transport-https ca-certificates curl gnupg
RUN echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" | tee /etc/apt/sources.list.d/gcsfuse.list
RUN echo "deb https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
RUN curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -

# Install gcsfuse and google cloud sdk
RUN apt-get update -y  && apt-get install -y gcsfuse google-cloud-sdk \
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Switch back to notebook user (defined in the base image)
USER $NB_UID

# make directory for mounting
RUN mkdir -p home/shared-data \
    && mkdir -p etc/scripts

COPY start_mounting.sh etc/scripts

# install extra packages required for model training
RUN pip install --upgrade pip
RUN pip install fasttext
RUN pip install ax-platform

CMD ["bin/bash", "etc/scripts/start_mounting.sh"]

Here is my start_mounting.sh script:

#!/bin/bash

# Setup GCSFuse
 gcsfuse --key-file ${GOOGLE_APPLICATIONS_CREDENTIALS} ${GCS_BUCKET} ${GCS_BUCKET_FOLDER}

My JupyterHub config.yaml:

hub:
  baseUrl: /jupyterhub
  extraConfig: |
    from kubernetes import client
    def modify_pod_hook(spawner, pod):
        pod.spec.containers[0].security_context = client.V1SecurityContext(
            privileged=True,
            capabilities=client.V1Capabilities(
                add=['SYS_ADMIN']
            )
          )
        pod.spec.containers[0].env.append(
              client.V1EnvVar(
                  name='GOOGLE_APPLICATIONS_CREDENTIALS',
                  value_from=client.V1EnvVarSource(
                      secret_key_ref=client.V1SecretKeySelector(
                          name='jhub-secret',
                          key='jhub-serviceaccount',
                      )
                  )
              )
          )
        return pod
    c.KubeSpawner.modify_pod_hook = modify_pod_hook

singleuser:
  storage:
    type: none
  extraEnv:
    GCS_BUCKET: "my-bucket"
    GCS_BUCKET_FOLDER: "shared-data"
  lifecycleHooks:
    postStart:
      exec:
        command: ["/bin/sh", "etc/scripts/start_mounting.sh"]
    preStop:
      exec:
        command: ["fusermount", "-u", "shared-data"]
  image:
    name: gcr.io/project/base-images/jhub-k8s-cust-singleuser
    tag: 1.1.6
    pullPolicy: Always

I am overriding the GOOGLE_APPLICATIONS_CREDENTIALS env var so that I can use it as the --key-file argument for gcsfuse.
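For reference, gcsfuse expects two positional arguments (the bucket name and the mount point), with the key file passed via --key-file, so the command in the script is intended to expand to something like this (a sketch using the values from my setup):

# Intended expansion: two positional arguments (bucket and mount point)
# plus the key file passed via the --key-file flag
gcsfuse --key-file test-serviceaccount.json my-bucket shared-data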

Could someone please tell me what is wrong here? Is something wrong with my pod's postStart exec command, or with my gcsfuse command?

tank
  • Can you add this at line 2 of your script: `echo ${GOOGLE_APPLICATIONS_CREDENTIALS}`? I guess it's not a file name, but the content of the JSON file. Right? – guillaume blaquiere Aug 08 '20 at 19:50
  • Oh yes, I realized it is the JSON content and not a JSON file. Thanks for pointing that out. How can I write this to a file before executing the script? – tank Aug 08 '20 at 20:22
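(A quick sketch of the check suggested in the comments above, using the same variable name:)

# Does the variable hold a path to an existing file, or the raw JSON content?
if [ -f "${GOOGLE_APPLICATIONS_CREDENTIALS}" ]; then
    echo "It is a path to a key file"
else
    echo "It is not a file path; it probably holds the JSON content itself"
fi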

2 Answers


I'm not an expert (or even a user) of JupyterHub, so my answer is generic.

I see two ways to solve your issue:

  • You can mount your secret as a file into the container at runtime (if you have your JSON key in a file). However, I don't know the JupyterHub syntax for achieving this.
  • You can try the following:

In your JupyterHub YAML file, rename the env var that receives the JSON key file content:

          pod.spec.containers[0].env.append(
              client.V1EnvVar(
                  name='GOOGLE_APPLICATIONS_CREDENTIALS_CONTENT',
                  value_from=client.V1EnvVarSource(
                      secret_key_ref=client.V1SecretKeySelector(
                          name='jhub-secret',
                          key='jhub-serviceaccount',
                      )
                  )
              )
          )

Change your script like this (write the content into the defined file):

#!/bin/bash

# Write the JSON key content into the file defined by GOOGLE_APPLICATIONS_CREDENTIALS
# (quoted so the shell does not word-split the JSON)
echo "${GOOGLE_APPLICATIONS_CREDENTIALS_CONTENT}" > "${GOOGLE_APPLICATIONS_CREDENTIALS}"

# Setup GCSFuse
gcsfuse --key-file "${GOOGLE_APPLICATIONS_CREDENTIALS}" "${GCS_BUCKET}" "${GCS_BUCKET_FOLDER}"

The container image is immutable, but I think this will work because the change is performed only at runtime, in the running container.
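(To verify, you could add a sanity check to the script before mounting; a sketch:)

# Hypothetical sanity check: make sure the key file was actually written
if [ ! -s "${GOOGLE_APPLICATIONS_CREDENTIALS}" ]; then
    echo "Key file was not written" >&2
    exit 1
fi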

Note: prefer an absolute path for the GOOGLE_APPLICATIONS_CREDENTIALS file path definition.
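For example (assuming the default notebook home directory /home/jovyan from the base image; adjust to your setup):

# e.g. in the startup script, or the equivalent ENV line in the Dockerfile
export GOOGLE_APPLICATIONS_CREDENTIALS=/home/jovyan/test-serviceaccount.json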

guillaume blaquiere

I solved it by creating a volume mount for the K8s secret (the Google service account key) and passing its path as an env var to the start_mounting.sh script for the gcsfuse command.

Below is the config that I used:

  storage:
    extraVolumes:
      - name: my-secret-jupyterhub
        secret:
          secretName: my-secret
    extraVolumeMounts:
      - name: my-secret-jupyterhub
        mountPath: /etc/secrets
        readOnly: true
  extraEnv:
    GOOGLE_APPLICATIONS_CREDENTIALS: /etc/secrets/key.json
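(For completeness, the secret itself can be created from the key file; a sketch with a hypothetical path to the downloaded key:)

# Hypothetical: create the K8s secret from the downloaded service account key,
# using key.json as the entry name so it gets mounted at /etc/secrets/key.json
kubectl create secret generic my-secret --from-file=key.json=/path/to/serviceaccount-key.json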

This seems to be a cleaner approach than reading the service account file's contents from an env var and writing them back to a file for the gcsfuse command, as I was doing previously and as discussed above.
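With the key mounted as a file, start_mounting.sh only needs the path (a sketch):

#!/bin/bash

# The key is already a file on disk, so nothing needs to be written out first
gcsfuse --key-file "${GOOGLE_APPLICATIONS_CREDENTIALS}" "${GCS_BUCKET}" "${GCS_BUCKET_FOLDER}"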

tank