I have a High Concurrency cluster with Active Directory integration turned on. Runtime: latest stable (Scala 2.11), Python 3.

I've mounted Azure Data Lake, and the first read after every cluster start fails with:

com.databricks.backend.daemon.data.client.adl.AzureCredentialNotFoundException: Could not find ADLS Gen1 Token

When I rerun the command, it works fine. I read the data as follows:

df = spark.read.option("inferSchema","true").option("header","true").json(path)

Any idea what is wrong?

Thanks! Tomek

  • You may refer to this article, which explains the same issue: https://kb.azuredatabricks.net/data-sources/access-adls1-from-sparklyr.html – CHEEKATLAPRADEEP Aug 06 '19 at 09:49

1 Answer

I believe you can only run the command using a high concurrency cluster. If you've attached your notebook to a standard cluster, the command won't work.
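
If the cluster type is already correct and only the first read after startup fails, a retry wrapper may serve as a stopgap, since the question reports that a rerun succeeds. The following is only a sketch: `read_with_retry` is a hypothetical helper, not a Databricks API, and the attempt count and delay are arbitrary assumptions. It does not address why the token is missing on the first call.

```python
import time


def read_with_retry(read_fn, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call read_fn() and retry on failure; re-raise the last error if all attempts fail.

    read_fn is any zero-argument callable, for example a lambda wrapping spark.read.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return read_fn()
        except Exception as error:  # broad on purpose: the JVM error arrives wrapped in a Python exception
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # wait before the next attempt
    raise last_error
```

In a notebook, where `spark` and `path` are already defined, this could be called as `df = read_with_retry(lambda: spark.read.json(path))`.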