I'm trying to connect to IBM's Spark as a Service running on Bluemix from RStudio running on my desktop machine.
I have copied the config.yml from the automatically configured RStudio environment running on IBM's Data Science Experience:
default:
…
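A minimal sketch of how such a connection is usually made, assuming the copied config.yml sits in the RStudio working directory and exposes the service settings under its "default" profile; the master value below is a placeholder, since the real value comes from the truncated config.yml above:

library(sparklyr)
# spark_config() reads config.yml from the working directory by default
config <- spark_config(file = "config.yml")
# placeholder master; the actual value is defined by the service's config
sc <- spark_connect(master = "yarn-client", config = config)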
I have a Spark dataframe tbl_pred with the following factor column:
Value
13,3
11
5,3
I'd like to convert those 'strings' to numeric values. I can use the as.numeric function, but this doesn't work because my decimal separator is a comma.
tbl_pred…
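A minimal sketch of the usual workaround: rewrite the decimal comma with Spark's regexp_replace (Hive SQL functions like this can be used directly inside mutate on Spark tables), then cast, which sparklyr translates to a SQL CAST:

library(dplyr)
tbl_pred <- tbl_pred %>%
  mutate(Value = as.numeric(regexp_replace(Value, ",", ".")))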
I am working with sparklyr and am having trouble changing column classes and then using dplyr to aggregate the data. This is my current code:
.libPaths(c(.libPaths(), '/usr/lib/spark/R/lib'))
Sys.setenv(SPARK_HOME =…
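A minimal sketch of the cast-then-aggregate pattern, assuming a Spark table tbl with a character column amount and a grouping column grp (both hypothetical names):

library(sparklyr)
library(dplyr)
tbl %>%
  mutate(amount = as.numeric(amount)) %>%  # translated to CAST(... AS DOUBLE)
  group_by(grp) %>%
  summarise(total = sum(amount))           # aggregation runs in Spark, not in R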
I need to fit GLMs on data that doesn't fit into my computer's memory. Usually to get around this issue, I would sample data, fit the model and then test on a different sample that would sit out of memory. This has been R's major limitation for me…
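One way around that limit is to fit the GLM in Spark itself, so the data never has to fit into R's memory. A minimal sketch using sparklyr's ml_generalized_linear_regression() (formula interface available in recent sparklyr versions), with a hypothetical file path and column names:

library(sparklyr)
sc  <- spark_connect(master = "local")
# memory = FALSE avoids caching the full table in cluster memory
dat <- spark_read_csv(sc, "dat", "path/to/big.csv", memory = FALSE)  # placeholder path
fit <- ml_generalized_linear_regression(dat, y ~ x1 + x2, family = "binomial")
summary(fit)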
I am trying to connect to Spark using the sparklyr package in R, and I am getting the following error:
library(sparklyr)
library(dplyr)
config <- spark_config()
config[["sparklyr.shell.conf"]] <-…
I'd like to be able to use the Java methods on a SparkR SparkDataFrame to write data to Cassandra.
Using the sparklyr extensions for example, I can do something like this:
sparklyr::invoke(sparklyr::spark_dataframe(spark_tbl), "write") %>%…
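A minimal sketch of how that sparklyr invoke chain might continue, assuming the spark-cassandra-connector is on the classpath and using hypothetical keyspace/table names:

sparklyr::spark_dataframe(spark_tbl) %>%
  sparklyr::invoke("write") %>%                                  # DataFrameWriter
  sparklyr::invoke("format", "org.apache.spark.sql.cassandra") %>%
  sparklyr::invoke("option", "keyspace", "my_keyspace") %>%      # placeholder
  sparklyr::invoke("option", "table", "my_table") %>%            # placeholder
  sparklyr::invoke("save")

The rough SparkR analogue is the unexported SparkR:::callJMethod(), applied to the @sdf slot of a SparkDataFrame.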
I have the following columns in my dataframe:
c1_sum | c2_sum | d | c1 | c2
The columns c# and c#_sum are dynamic. I'm trying to do something like this for all c#:
mutate(c#_weight = (d * c#) / c#_sum)
The final result would be:
c1_sum | c2_sum |…
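A minimal sketch of building those mutate expressions programmatically with tidy evaluation, assuming the columns follow the c1/c1_sum pattern shown above:

library(dplyr)
library(purrr)
library(rlang)
idx   <- 1:2   # however many c# columns exist
exprs <- set_names(
  map(idx, ~ expr((d * !!sym(paste0("c", .x))) / !!sym(paste0("c", .x, "_sum")))),
  paste0("c", idx, "_weight")
)
df <- df %>% mutate(!!!exprs)   # splices c1_weight, c2_weight, ... in one call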
I'm looking to run a GraphX example on my Windows machine using spark-shell from a sparklyr install of Hadoop/Spark. I am able to launch the shell from the install directory first:
start…
I'm trying to load a SQL table into R through SparkR. I have the following code:
Sys.setenv(SPARK_HOME = "C:/Users/hms/Desktop/spark-2.0.1-bin-hadoop2.7/spark-2.0.1-bin-hadoop2.7",
HADOOP_HOME =…
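A minimal sketch of reading a table over JDBC with SparkR 2.0's read.jdbc(), assuming a placeholder connection URL and that the matching JDBC driver jar has been added to the session (e.g. via spark.jars):

library(SparkR)
sparkR.session(sparkHome = Sys.getenv("SPARK_HOME"))
df <- read.jdbc(url = "jdbc:sqlserver://host:1433;databaseName=mydb",  # placeholder URL
                tableName = "dbo.mytable",                             # placeholder table
                user = "user", password = "pass")
head(df)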
I am using sparklyr and it seems to be working well. However, some of my existing code will not run.
When I use
complete.cases
I get
Error: org.apache.spark.sql.AnalysisException: undefined function COMPLETE.CASES
I get the same…
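complete.cases() is a plain R function with no Spark SQL translation, which is why Spark reports an undefined function. A minimal sketch of the usual workaround on a Spark table, with hypothetical column names:

library(dplyr)
tbl_clean <- tbl %>% filter(!is.na(col1) & !is.na(col2))  # col1/col2 are placeholders

Recent sparklyr versions also provide an na.omit() method for Spark tables (worth checking against your installed version), which drops any row containing a NULL.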
I tried the following command in my local RStudio session to connect to Spark:
sc <- spark_connect(master = "spark://x.x.x.x:7077",
                   spark_home = "/home/hduser/spark-2.0.0-bin-hadoop2.7",
                   version = "2.0.0", config = list())
But I am getting the following…
I'm trying to connect R to Spark following the sparklyr tutorial from RStudio: http://spark.rstudio.com/
But somehow I'm getting a weird error message, shown below. Does anyone know how to solve this?
I have tried to add the C:\Windows\system32 path…
Hello, I am just getting started with sparklyr and I am getting an error when trying to use dplyr to wrangle some data.
library(sparklyr)
sc <- spark_connect(master = "local")
spark_read_csv(sc, "df2_tbl",
"C:/Users/...csv")
spark_read_csv(sc,…
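A minimal sketch of the read-then-wrangle pattern that usually works, with a hypothetical file path and column name; note that spark_read_csv() returns the table reference that the dplyr verbs should be applied to:

library(sparklyr)
library(dplyr)
sc  <- spark_connect(master = "local")
df2 <- spark_read_csv(sc, "df2_tbl", "C:/Users/me/df2.csv")  # placeholder path
df2 %>%
  group_by(some_col) %>%   # some_col is a placeholder
  summarise(n = n())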
Currently, I am using a foreach loop from the doParallel library to run function calls in parallel across multiple cores of the same machine, which looks something like this:
out_results <- foreach(i = 1:length(some_list)) %dopar%
{
…
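A minimal sketch of the Spark-side analogue using spark_apply() (available in newer sparklyr versions), which ships an R closure to the executors and runs it once per partition; the input data frame and the squaring body below are stand-ins for some_list and the real loop body:

library(sparklyr)
library(dplyr)
sc  <- spark_connect(master = "local")
dat <- copy_to(sc, data.frame(i = 1:100), "dat")
out <- spark_apply(dat, function(df) {
  # runs inside the executors; must take and return a data.frame
  data.frame(i = df$i, result = df$i^2)
})
collect(out)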