Is there a way to redirect the notebook directory to S3 in the JSON configuration file of an EMR cluster before starting the cluster? I used the classification "jupyter-notebook-conf" and set the following option: c.NotebookApp.notebook_dir "s3://[bucket]/path"

That creates the config file "/etc/jupyter/jupyter_notebook_config.py" with the given entry, but the notebook directory is not redirected.

Previously, when I installed JupyterHub manually, I could use the --notebook-dir option. Now I am trying to use the preinstalled JupyterHub service of the EMR cluster (see: Run Jupyter Notebook and JupyterHub on Amazon EMR).

mad

1 Answer

This is not supported on EMR 5.16.

EMR 5.17 allows it by adding the following configuration classification:

[
    {
        "Classification": "jupyter-s3-conf",
        "Properties": {
            "s3.persistence.enabled": "true",
            "s3.persistence.bucket": "MyJupyterBucket"
        }
    }
]
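
If the configuration file is generated rather than written by hand, the classification above could be produced with a small script. This is only a sketch: the function names and the output filename `jupyter-s3-conf.json` are made up for illustration, and the bucket name is the placeholder from the example above.

```python
import json


def build_jupyter_s3_configuration(bucket_name):
    """Return the configuration list shown above for the given bucket.

    Hypothetical helper; only the classification name and the two
    property keys come from the answer.
    """
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    return [
        {
            "Classification": "jupyter-s3-conf",
            "Properties": {
                # EMR expects property values as strings, hence "true".
                "s3.persistence.enabled": "true",
                "s3.persistence.bucket": bucket_name,
            },
        }
    ]


def write_configuration(path, bucket_name):
    """Write the configuration list to `path` as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_jupyter_s3_configuration(bucket_name), handle, indent=4)


if __name__ == "__main__":
    # Placeholder bucket name taken from the example above.
    write_configuration("jupyter-s3-conf.json", "MyJupyterBucket")
```

The resulting file can then be passed when creating the cluster, for example with the AWS CLI option `--configurations file://jupyter-s3-conf.json`, alongside whatever other options the cluster needs.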
Steven
  • Thank you for the answer. I had written a script to get it working, but this looks much better. :D – mad Sep 11 '18 at 11:56
  • @Steven does it actually work for you? It seems like the container running jupyterhub doesn't inherit the credentials to S3. I added them manually (`export...`). Then, it turned out that `dask` and `toolz` are not installed in the container. I added them as well and still: no success. – Dror Nov 23 '18 at 07:18
  • @Dror this **does** work for me. If you believe credentials are an issue I would look at the IAM role and SGs associated with the EMR master – Steven Nov 28 '18 at 00:44