
I have Spark (3.0.1), Livy (0.8.0) and JupyterHub (with sparkmagic) running on K8S in a specific namespace, with the Kubernetes master used as the resource manager.

When I try to create a PySpark session in a JupyterHub notebook, I get this error:

22/02/04 12:09:16 WARN InteractiveSession: Failed to stop RSCDriver. Killing it...
22/02/04 12:09:18 WARN InteractiveSession: Error stopping session 2. io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc.cluster.local/api/v1/pods?labelSelector=spark-app-tag%2Cspark-role%3Ddriver%2Cspark-app-selector. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods is forbidden: User "system:serviceaccount:namespace:livy-acc" cannot list resource "pods" in API group "" at the cluster scope.

The error shows that Livy's Kubernetes client tries to list pods cluster-wide but lacks the permissions to do so.
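To make the scope difference concrete, here is a minimal sketch that builds the cluster-scoped URL from the log next to its namespace-scoped counterpart. The `/api/v1/namespaces/<namespace>/pods` path is the standard Kubernetes REST path for listing pods in one namespace; the helper name `pods_list_url` is hypothetical and is not part of Livy or the fabric8 client.

```python
from urllib.parse import quote

# Values copied from the error message above.
API_SERVER = "https://kubernetes.default.svc.cluster.local"
LABEL_SELECTOR = "spark-app-tag,spark-role=driver,spark-app-selector"


def pods_list_url(namespace=None):
    """Return the pod-list URL: cluster-scoped if namespace is None, else namespaced.

    Hypothetical helper for illustration only.
    """
    selector = quote(LABEL_SELECTOR, safe="")
    if namespace is None:
        # Cluster scope: requires a ClusterRole that allows "list" on pods.
        path = "/api/v1/pods"
    else:
        # Namespace scope: a Role in that namespace is sufficient.
        path = f"/api/v1/namespaces/{quote(namespace, safe='')}/pods"
    return f"{API_SERVER}{path}?labelSelector={selector}"


print(pods_list_url())             # the request that fails in the log
print(pods_list_url("namespace"))  # the request a namespace-restricted client would send
```

The first URL is the one in the log; a request of the second form can be authorized by a namespaced Role alone.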

Is it possible to restrict Livy to a single namespace on Kubernetes? Granting it a cluster role is not an option because of security concerns.

Artyom Rebrov
  • Are you using https://googlecloudplatform.github.io/spark-on-k8s-operator/docs/api-docs.html for your spark deployment? – tafaust May 10 '22 at 12:54
  • No, spark operator is not used. When doing spark-submit, kubernetes master url is used as --master and spark image as spark.kubernetes.container.image --conf parameter, like this: https://spark.apache.org/docs/latest/running-on-kubernetes.html#cluster-mode – Artyom Rebrov May 13 '22 at 07:29

1 Answer


While examining the Livy 0.8.0 code in this repo: https://github.com/jahstreet/incubator-livy.git --branch merge/first, I found some undocumented livy.server.kubernetes.* properties that configure how Livy runs on K8S.

To restrict Livy to a K8S namespace, the following properties can be used:

# Comma-separated list of the Kubernetes namespaces to allow for applications creation.
# All namespaces are allowed if empty
livy.server.kubernetes.allowedNamespaces = namespace

# Kubernetes client default namespace
livy.server.kubernetes.defaultNamespace = namespace
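With Livy confined to one namespace, the service account from the question can presumably be bound to a namespace-scoped Role instead of a ClusterRole. The sketch below only generates such manifests as JSON, which `kubectl apply -f` accepts. The object name `livy-pod-access` and the verb list are assumptions; Livy and Spark will likely need more permissions than this (for example creating and deleting pods, services and configmaps), so treat it as a starting point.

```python
import json

# Names taken from the error message in the question.
NAMESPACE = "namespace"
SERVICE_ACCOUNT = "livy-acc"


def livy_rbac(namespace, service_account, name="livy-pod-access"):
    """Build a namespaced Role and RoleBinding as a Kubernetes List object.

    The object name and the verbs are illustrative assumptions.
    """
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            # "" is the core API group, which is where pods live.
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
        "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": name},
    }
    return {"apiVersion": "v1", "kind": "List", "items": [role, binding]}


print(json.dumps(livy_rbac(NAMESPACE, SERVICE_ACCOUNT), indent=2))
```

Saving the output to a file and running `kubectl apply -f` on it creates both objects in the target namespace without granting any cluster-wide access.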
Artyom Rebrov
  • For completeness, these options originate from https://github.com/apache/incubator-livy/pull/249/files#diff-716e82624363667343871aa9b12cbc8e9a14eed6665652f36a8c5f9854a5d71bR169 and the PR is/was broken down into several smaller parts. – tafaust May 10 '22 at 13:09