I am deploying some Databricks clusters with a PowerShell script that takes as input a JSON file of predefined cluster templates, for example:

{
    "cluster_name": "test1",
    "max_retries": 1,
    "spark_version": "5.3.x-scala2.11",
    "timeout_seconds": 3600,
    "autotermination_minutes": 60,
    "node_type_id": "Standard_DS3_v2",
    "driver_node_type_id": "Standard_DS3_v2",
    "spark_env_vars": {
        "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
    },
    "spark_conf": {
        "spark.databricks.delta.preview.enabled": "true"
    },
    "autoscale": {
        "max_workers": 4,
        "min_workers": 2
    }
}

However, I would like to pre-assign some Databricks permission groups to these clusters. Can I do that in such a cluster template? I cannot find any property that would let me specify those groups.

I can go to one of my clusters that has permissions assigned manually and export it as JSON. However, the permissions are missing from that export as well.

Thank you in advance!

Grevioos

2 Answers


The workaround below is extremely hacky, and I would not advise anyone to use it if I knew of another way. It creates a web session, logs in, fetches a CSRF token, and then issues a POST request to /acl/cluster/<cluster_id> with a map from user IDs to the requested permissions. Here is an example in Python that sets all permissions on a single cluster for a single user (or group):

import json

import requests

DB_HOST = "db-cluster"
DB_USER = "user"
DB_PASS = "pass"

def change_acl(user_id, cluster_id):
    """Grant all permissions on one cluster to one user (or group) ID.

    Relies on an undocumented web endpoint, so it may break without notice.
    """
    host = DB_HOST
    username = DB_USER
    password = DB_PASS

    # Log in with username and password; the session keeps the auth cookie.
    session = requests.Session()
    login_request = session.post("https://{}/j_security_check".format(host),
                                 data={"j_username": username, "j_password": password})
    if login_request.status_code >= 400:
        raise Exception("login failed : {}".format(login_request.content))

    # The /config response carries the CSRF token needed for the ACL request.
    config_request = session.get("https://{}/config".format(host))
    if config_request.status_code >= 400:
        raise Exception("config request failed : {}".format(config_request.content))

    config = json.loads(config_request.content)
    csrf_token = config["csrfToken"]

    # Set all permissions ("*") on the cluster for user_id.
    acl_request = session.post(
        "https://{}/acl/cluster/{}".format(host, cluster_id),
        headers={
            "X-CSRF-Token": csrf_token,
            "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"
        },
        data=json.dumps({
            "type": "set",
            "permissions": {user_id: ["*"]}
        })
    )
    if acl_request.status_code >= 400:
        raise Exception("acl request failed : {}".format(acl_request.content))

If you find a better way, please let me know. The worst thing about this approach is that you have to log in with a username and password instead of a bearer token. The second worst is that it may break without notice.

I hope the developers will find the time to implement this functionality in the near future.
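
For comparison, here is a minimal sketch of what the same grant could look like with a token-based REST call. It assumes the workspace exposes the Permissions API at /api/2.0/permissions/clusters/<cluster_id>, which Databricks added after this answer was written, so check it against your workspace before relying on it. The function names, the group names, and the choice of PATCH (add to existing grants rather than replace them) are illustrative assumptions, not part of the original workaround.

```python
import json
import urllib.request

# Assumption: the cluster permission levels named in the Permissions API docs.
CLUSTER_PERMISSION_LEVELS = {"CAN_ATTACH_TO", "CAN_RESTART", "CAN_MANAGE"}


def build_acl_payload(group_grants):
    """Turn {"group name": "PERMISSION_LEVEL"} into a request body."""
    acl = []
    for group_name, level in sorted(group_grants.items()):
        if level not in CLUSTER_PERMISSION_LEVELS:
            raise ValueError("unknown permission level: {}".format(level))
        acl.append({"group_name": group_name, "permission_level": level})
    return {"access_control_list": acl}


def grant_cluster_permissions(host, token, cluster_id, group_grants):
    """PATCH the grants onto one cluster, authenticating with a bearer token.

    Assumption: the endpoint path below exists in the target workspace.
    """
    request = urllib.request.Request(
        "https://{}/api/2.0/permissions/clusters/{}".format(host, cluster_id),
        data=json.dumps(build_acl_payload(group_grants)).encode("utf-8"),
        headers={
            "Authorization": "Bearer {}".format(token),
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as grant_cluster_permissions(DB_HOST, token, cluster_id, {"data-engineers": "CAN_RESTART"}) would then replace the login, CSRF, and /acl steps above, if the endpoint is available.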

Midiparse

Note: You cannot specify permissions when creating a cluster through the Clusters API. Use the Groups API or the Admin Console instead.
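
Because the create request has no field for permissions, one way to still keep everything in a single template file is to store the group grants under an extra key and strip that key out before the template is sent to the Clusters API. The sketch below illustrates that idea only; the "permissions" key, the group name, and the permission level string are hypothetical conventions for the deployment script, not fields the Clusters API understands.

```python
import json

# Hypothetical key: NOT part of the Clusters API, only a local convention
# for keeping group grants next to the cluster definition in one file.
PERMISSIONS_KEY = "permissions"


def split_template(template):
    """Return (cluster_spec, group_grants) without mutating the input dict."""
    cluster_spec = {k: v for k, v in template.items() if k != PERMISSIONS_KEY}
    group_grants = dict(template.get(PERMISSIONS_KEY, {}))
    return cluster_spec, group_grants


# Shortened version of the template from the question, plus the extra key.
template = json.loads("""
{
    "cluster_name": "test1",
    "spark_version": "5.3.x-scala2.11",
    "node_type_id": "Standard_DS3_v2",
    "autoscale": {"min_workers": 2, "max_workers": 4},
    "permissions": {"data-engineers": "CAN_RESTART"}
}
""")

# cluster_spec goes to the create-cluster call; group_grants is applied
# afterwards by whatever mechanism is used to assign permissions.
cluster_spec, group_grants = split_template(template)
```

Splitting in the script keeps the create request valid while the grants stay under version control with the rest of the template.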

[Image: request structure of the Clusters API create call]

Privileges can be granted to users or groups that are created via the groups API and Admin Console. Each user is uniquely identified by their username (which typically maps to their email address) in Databricks. Users who are workspace administrators in Databricks belong to a special admin role and can also access objects that they haven’t been given explicit access to.

Hope this helps.


CHEEKATLAPRADEEP

Why would someone accept this as a satisfactory answer? Your first sentence gives a negative answer. At this point, I would expect the rest of the post to hint at some kind of workaround for the problem. However, here you continue with some marginally related facts, not what someone searching for an answer would be looking for. – Midiparse Sep 23 '19 at 07:47