I am using this quickstart guide (https://github.com/aws-quickstart/quickstart-hail) to set up EMR with SageMaker.
Due to security requirements, I had to enable Kerberos (a local KDC within the EMR cluster), and I followed this guide (https://aws.amazon.com/blogs/machine-learning/securing-data-analytics-with-an-amazon-sagemaker-notebook-instance-and-kerberized-amazon-emr-cluster/) for the Kerberos setup.
Everything works well, except that the Bokeh plots cannot be saved because of an access restriction.
I ran ls -la / from the SageMaker notebook (via sparkmagic + Livy), but the plot paths /plots and /var/www/html/plots do not appear in the listing and are not accessible.
However, when I run ls -la over SSH on the master node, I can see both folders. Changing the permissions with chmod -R 777 /var/www did not resolve the issue either.
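To narrow this down, a check like the following could be run in a notebook cell and again in a Python shell over SSH on the master node, then compared. It is only a sketch: describe_execution_context is a hypothetical helper name, and the two default paths are the ones mentioned above. Depending on how Livy is configured, the session may be executing on a different host or as a different user than the SSH login, and this would show it.

```python
import getpass
import os
import socket


def describe_execution_context(paths=("/plots", "/var/www/html/plots")):
    """Report which host and user this code runs as, and whether each path is visible.

    Hypothetical diagnostic helper; the default paths are the plot
    directories from the question.
    """
    try:
        user = getpass.getuser()
    except (KeyError, OSError):
        # Fall back to the numeric uid if the account has no passwd entry.
        user = str(os.getuid())

    return {
        "host": socket.gethostname(),
        "user": user,
        # Top-level directory names, roughly what `ls /` would list.
        "root_entries": sorted(os.listdir("/")),
        "paths": {
            p: {"exists": os.path.exists(p), "writable": os.access(p, os.W_OK)}
            for p in paths
        },
    }


info = describe_execution_context()
print(info)
```

If the host name printed in the notebook differs from the master node's host name, the directories are probably missing on that machine rather than being hidden by a permission setting.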
Any idea whether there is a Kerberos/Livy setting that hides or protects certain file paths from Kerberos-authenticated users?