
I would like to use the spark-csv package with SparkR in RStudio. It works perfectly in the SparkR shell, but I haven't found a way to include it in an RStudio session.

Any idea how to do it?

Thanks for your help

    What do you mean by "[...] to include it in a RStudio session"? –  Jun 24 '15 at 09:21
  • Do you mean to say that you are not able to load the package in RStudio? – psteelk Jun 24 '15 at 09:23
  • @psteelk Yes, I was not able to load the package in RStudio. I actually found an answer [here](http://stackoverflow.com/questions/30952039/sparkr-and-packages). The problem is that I need to build an assembly jar, which is not really convenient. – Alban Phélip Jun 24 '15 at 11:46

1 Answer


I have had the same problem; see this question.

The solution given by Pragith works perfectly, without building the assembly jar: run

Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.0.3" "sparkr-shell"')

before

library(SparkR)

Then you can read .csv files from within RStudio. In the same way, you should be able to include any other Spark package you want.
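A minimal sketch of a full session, assuming a Spark 1.4-era SparkR installation (where `sparkR.init` and `sparkRSQL.init` are the entry points); the file path and the `--packages` coordinates are placeholders you would adapt to your setup:

```r
# Tell spark-submit to pull in spark-csv; this must happen
# BEFORE library(SparkR) so the JVM starts with the package.
Sys.setenv('SPARKR_SUBMIT_ARGS' =
  '"--packages" "com.databricks:spark-csv_2.10:1.0.3" "sparkr-shell"')

library(SparkR)

# Initialize the Spark and SQL contexts (SparkR 1.4-era API)
sc <- sparkR.init(master = "local[*]")
sqlContext <- sparkRSQL.init(sc)

# Read a CSV file using the spark-csv data source
# ("/path/to/data.csv" is a hypothetical path)
df <- read.df(sqlContext, "/path/to/data.csv",
              source = "com.databricks.spark.csv",
              header = "true", inferSchema = "true")

head(df)
```

The key design point is that `SPARKR_SUBMIT_ARGS` is read when SparkR launches its backing JVM, so setting it after `library(SparkR)` has already started a context has no effect.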

Wannes Rosiers