Is there a way to connect RStudio running on an Azure Databricks cluster to Delta Lake / Delta tables? (Read and write would be awesome.)

In RStudio on the cluster I tried setting the path to the home directory as:

- dbfs:/mnt/20_silver/
- ~dbfs:/mnt/20_silver/
- ~/mnt/20_silver/
- /mnt/20_silver/

But I still didn't succeed. Any hints?


2 Answers


To make a Spark connection you can run either of the following in the RStudio UI. With SparkR:

SparkR::sparkR.session()

Or with sparklyr:

library(sparklyr)
sc <- spark_connect(method = "databricks")

It will work unless you have ACLs on the file system.
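
For a quick check that the session can actually read the mounted Delta data, something like this should work from SparkR (a sketch: the /mnt/20_silver mount comes from the question, and the my_table folder name is a placeholder):

# Read a Delta table through the SparkR session; Databricks runtimes
# register "delta" as a data source. The table folder name is hypothetical.
df <- SparkR::read.df("/mnt/20_silver/my_table", source = "delta")
SparkR::head(df)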

  • If you have ACLs you will get the following error: In file.create(to[okay]) : cannot create file '/usr/local/lib/R/site-library/sparklyr/java//sparklyr-2.4-2.11.jar', reason 'Permission denied' – Iskandel Nov 18 '19 at 15:42
  • Have you resolved the issue, or are you still looking for an answer? – CHEEKATLAPRADEEP Nov 19 '19 at 10:50
  • It turned out to be more complex: even if I fix the ACLs, do I want all the R users to get access to the whole Delta lake? If you have any experience or ideas, please share. – Iskandel Nov 21 '19 at 08:31

Solved! sparklyr provides spark_read_delta():

spark_read_delta(sc, path, name = NULL, version = NULL,
  timestamp = NULL, options = list(), repartition = 0,
  memory = TRUE, overwrite = TRUE, ...)

https://www.rdocumentation.org/packages/sparklyr/versions/1.0.5/topics/spark_read_delta
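
For completeness, a minimal read-and-write round trip using the sparklyr connection from the other answer (a sketch: the events folder names under /mnt/20_silver are placeholders, and spark_write_delta() is sparklyr's counterpart for writing Delta):

library(sparklyr)

# Connect to the Spark session of the Databricks cluster
sc <- spark_connect(method = "databricks")

# Read a Delta table from the mounted path into Spark
events <- spark_read_delta(sc, path = "/mnt/20_silver/events", name = "events")

# Write it back as Delta; mode = "overwrite" replaces existing data at the path
spark_write_delta(events, path = "/mnt/20_silver/events_copy", mode = "overwrite")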
