I want to mount ADLS Gen2 storage accounts in Azure Databricks, but I am using an Azure account where I don't have access to create a service principal. So I am trying to mount the containers using access keys, but I keep getting errors.

spark.conf.set("fs.azure.account.key.azadfdatalakegen2.dfs.core.windows.net",dbutils.secrets.get(scope="azdatabricks-adlsgen2SA", key="Azdatrbricks-adlsgen2-accesskeys"))
dbutils.fs.mount(
  source = "abfss://raw@azadfdatalakegen2.dfs.core.windows.net/",
  mount_point = "/mnt/raw_adlsmnt")

I keep getting the error message below:

> --------------------------------------------------------------------------- ExecutionError                            Traceback (most recent call
> last) <command-555436758533424> in <module>
>       1 spark.conf.set("fs.azure.account.key.azadfdatalakegen2.dfs.core.windows.net",dbutils.secrets.get(scope="azdatabricks-adlsgen2SA",
> key="Azdatrbricks-adlsgen2-accesskeys"))
> ----> 2 dbutils.fs.mount(
>       3   source = "abfss://raw@azadfdatalakegen2.dfs.core.windows.net/",
>       4   mount_point = "/mnt/raw_adlsmnt")
> 
> /databricks/python_shell/dbruntime/dbutils.py in
> f_with_exception_handling(*args, **kwargs)
>     387                     exc.__context__ = None
>     388                     exc.__cause__ = None
> --> 389                     raise exc
>     390 
>     391             return f_with_exception_handling
> 
> ExecutionError: An error occurred while calling o548.mount. :
> java.lang.NullPointerException: authEndpoint  at
> shaded.databricks.v20180920_b33d810.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>   at
> shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenUsingClientCreds(AzureADAuthenticator.java:84)
>   at
> com.databricks.backend.daemon.dbutils.DBUtilsCore.verifyAzureOAuth(DBUtilsCore.scala:803)
>   at
> com.databricks.backend.daemon.dbutils.DBUtilsCore.verifyAzureFileSystem(DBUtilsCore.scala:814)
>   at
> com.databricks.backend.daemon.dbutils.DBUtilsCore.createOrUpdateMount(DBUtilsCore.scala:734)
>   at
> com.databricks.backend.daemon.dbutils.DBUtilsCore.mount(DBUtilsCore.scala:776)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)  at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)     at
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)  at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)    at
> py4j.Gateway.invoke(Gateway.java:295)     at
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>   at py4j.commands.CallCommand.execute(CallCommand.java:79)   at
> py4j.GatewayConnection.run(GatewayConnection.java:251)    at
> java.lang.Thread.run(Thread.java:748)

Is there any way to mount an ADLS Gen2 container using access keys?

2 Answers

Databricks no longer recommends mounting external data locations to the Databricks Filesystem; see Mounting cloud object storage on Azure Databricks.

The recommended approach is to set the Spark configuration for access to ADLS Gen2 and then read and write storage files directly by URL.

Accessing ADLS Gen2 with an account key is covered in: Access ADLS Gen2 storage using Account Key in Azure Databricks
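
As a rough sketch of the account-key approach, using the storage account, container, and secret names from the question: the helper names `account_key_conf`, `abfss_url`, and `configure_account_key` are hypothetical, and `spark` and `dbutils` are assumed to be the objects a Databricks notebook provides.

```python
def account_key_conf(storage_account: str) -> str:
    """Name of the Spark config key that holds the access key for an ADLS Gen2 account."""
    return f"fs.azure.account.key.{storage_account}.dfs.core.windows.net"


def abfss_url(container: str, storage_account: str, path: str = "") -> str:
    """Build an abfss:// URL for direct (unmounted) access to a container."""
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/{path.lstrip('/')}"


def configure_account_key(spark, dbutils, storage_account, scope, key):
    """Set the account key on the Spark session, reading it from a secret scope."""
    spark.conf.set(
        account_key_conf(storage_account),
        dbutils.secrets.get(scope=scope, key=key),
    )


# In a Databricks notebook (spark and dbutils exist only there):
# configure_account_key(spark, dbutils, "azadfdatalakegen2",
#                       "azdatabricks-adlsgen2SA", "Azdatrbricks-adlsgen2-accesskeys")
# display(dbutils.fs.ls(abfss_url("raw", "azadfdatalakegen2")))
```

No mount is created here; files are addressed by their full `abfss://` URL once the key is set on the session.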

Accessing ADLS Gen2 with a SAS token is covered in: Access ADLS Gen2 or Blob Storage using a SAS token in Azure Databricks

Please note that the SAS token should be generated at the storage account level for this to work properly.
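
A minimal sketch of the SAS-token variant, assuming the config key names are as I understand them from the Azure Databricks documentation for the ABFS driver. The helpers `sas_confs` and `apply_confs` and the secret scope and key names in the usage comment are hypothetical.

```python
def sas_confs(storage_account: str, sas_token: str) -> dict:
    """Spark settings for SAS-token access to one ADLS Gen2 account."""
    host = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "SAS",
        f"fs.azure.sas.token.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
        f"fs.azure.sas.fixed.token.{host}": sas_token,
    }


def apply_confs(spark, confs: dict) -> None:
    """Apply each setting to the current Spark session."""
    for name, value in confs.items():
        spark.conf.set(name, value)


# In a Databricks notebook (scope and key names below are placeholders):
# token = dbutils.secrets.get(scope="<scope>", key="<sas-token-key>")
# apply_confs(spark, sas_confs("azadfdatalakegen2", token))
# df = spark.read.csv("abfss://raw@azadfdatalakegen2.dfs.core.windows.net/<path>")
```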

ShaikMaheer

If you want to mount a storage account in Azure Databricks, use the syntax below:

dbutils.fs.mount(
    source = "wasbs://pool@vamblob.blob.core.windows.net/",
    mount_point = "/mnt/io234",
    extra_configs = {"fs.azure.account.key.vamblob.blob.core.windows.net":dbutils.secrets.get(scope = "demo_secret", key = "demo123")})

Alternative Approach

First, create an app registration in Azure Active Directory; this gives you the client ID and tenant ID.

Next, create a client secret for the app registration in AAD.

Then use the syntax below to create the mount:

configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "xxxxxxxxx",
           "fs.azure.account.oauth2.client.secret": "xxxxxxxxx",
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/xxxxxxxxx/oauth2/v2.0/token",
           "fs.azure.createRemoteFileSystemDuringInitialization": "true"}

dbutils.fs.mount(
    source = "abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<folder_name>",
    mount_point = "/mnt/<folder_name>",
    extra_configs = configs)
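
The same settings can be built from parameters instead of hard-coded strings, so that the client secret can come from a secret scope. This is only a sketch: `oauth_mount_configs` is a hypothetical helper, and the scope and key names in the usage comment are placeholders. The endpoint key it fills in appears to be the value the `authEndpoint` NullPointerException in the question is complaining about, since that mount call passed no `extra_configs`.

```python
def oauth_mount_configs(client_id: str, client_secret: str, tenant_id: str) -> dict:
    """Build the extra_configs dict for an OAuth (service principal) mount."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        # Token endpoint in the same form as the answer above, with the tenant ID filled in.
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
        "fs.azure.createRemoteFileSystemDuringInitialization": "true",
    }


# In a Databricks notebook (scope and key names below are placeholders):
# configs = oauth_mount_configs(
#     client_id="<client-id>",
#     client_secret=dbutils.secrets.get(scope="<scope>", key="<client-secret-key>"),
#     tenant_id="<tenant-id>",
# )
# dbutils.fs.mount(
#     source="abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<folder_name>",
#     mount_point="/mnt/<folder_name>",
#     extra_configs=configs)
```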

For more information, refer to this article by Ron L'Esteve.

B. B. Naga Sai Vamsi
  • Hi Vamsi, thanks for the reply. In the first approach you mounted a Blob Storage account, but I want to mount a Data Lake Gen2 storage account. If possible, can you try mounting an ADLS Gen2 account with SAS or access keys? – Saravana Kumar Aug 15 '22 at 11:05