Highest Voted 'sparklyr' Questions

0 votes, 0 answers

Remove words with a length of maximum 2. Spark

I want to remove (or replace with a non-blank value) all words of a length less than 2 in sparklyr. My attempt is below, but it doesn't work: Tab8b <- tab8 %>% Ft_sql_transformer( sql="select * , Regexp_replace(VAR,…

replace hive sparklyr

asked Aug 29 '17 at 20:28

Camel

11
5

0 votes, 1 answer

Creating a pie chart using the genderizeR package in sparklyr, R

I am trying to create a pie chart in R using the genderizeR package. I am following the code below from https://www.r-bloggers.com/the-gender-of-big-data/: library(rvest) library(stringr) library(dplyr) library(genderizeR) library(ggplot2) …

r apache-spark rstudio sparklyr

asked Aug 08 '17 at 03:19

RJ_Programmer

31
1
6

0 votes, 2 answers

SparklyR removing a tbl from Spark Context

Similar to: SparklyR removing a Table from Spark Context, but different because: The above question asks how to remove a "table" from Spark, here created by the copy_to function. If the spark_read_csv() function is used instead, it appears that there…

r apache-spark rstudio sparklyr

asked Aug 07 '17 at 12:58

DaveRGP

1,430
15
34

0 votes, 0 answers

Connecting to Spark from R using username password

We have a requirement wherein we plan to use sparklyr to execute model code written in R over Spark. The Spark cluster we use is a Kerberized cluster. We are able to connect to this cluster and execute our code using a keytab. The challenge we…

r hadoop apache-spark sparkr sparklyr

asked Jul 18 '17 at 17:24

Asish Balakrishnan

41
3

0 votes, 0 answers

Looking for a way to access files on a Windows AWS server from RStudio

I have installed RStudio on my local laptop and am trying to access files located on an AWS server (Windows). I do not want to use the FTP protocol. What are other possible ways to remotely access the files located on a remote server? How to use SCP/SSH…

r rstudio sparklyr

asked Jul 15 '17 at 08:14

SC_kumar

21
5

0 votes, 1 answer

dplyr to replace all variable which matches specific string

Is there a dplyr equivalent that does this? I'm after a 'replace all' that replaces values matching the string xxx with NA: is.na(df) <- df=="xxx". I want to execute a sparklyr command using the pipe operator from R on a Spark dataframe, tbl(sc,"df") %>%, and sticking the…

r apache-spark dplyr null sparklyr

asked Jul 14 '17 at 04:48

Choc_waffles

518
1
4
15

0 votes, 0 answers

Rsparkling memory issue

I'm running out of memory when I try to fit a random forest model on my dataset (5888 bytes) using the rsparkling random forest function with the following: h2o.randomForest(x = x, y = y, training_frame =…

r apache-spark h2o sparklyr

asked Jul 13 '17 at 17:15

mike

35
6

0 votes, 0 answers

sparklyr help: spark_read_csv returns an error

I have a 3 GB CSV file called accelerometer.csv on my computer. I wanted to read it into Spark using R and the sparklyr package just as an experiment before importing seriously big data (180 GB). I used this code here: spark_c <- spark_connect(master =…

r apache-spark sparklyr

asked Jul 10 '17 at 23:03

user7426583

0 votes, 0 answers

Error in connecting with Spark using spark_connect command in 'sparklyr': (R-3.4.0)

I have Spark 1.6.2 installed on my system. I am also using R (3.4.0) with rstudio-server 1.0.143 on a CentOS 6.9 machine. Whenever I run the command sc <- spark_connect(master = "local"), it shows an error message stating that: Error in…

r hadoop apache-spark sparkr sparklyr

asked Jun 30 '17 at 11:43

Pulkit Joshi

9
1

0 votes, 0 answers

Sparklyr: how to improve reading speed for JSON files?

I am trying to load about 40 large JSON files (150-200 GB each on average) into Spark using sparklyr. Some of the files would fit entirely in the RAM of a cluster; others would be too big. Unfortunately, the command…

json r apache-spark sparklyr

asked Jun 29 '17 at 10:52

ℕʘʘḆḽḘ

18,566
34
128
235

0 votes, 0 answers

Just learning sparklyr - copy_to() error

I know this is a very simple question, and I assume it has been asked before but I have been unable to find it. I would like to learn sparklyr. However, I wrote devtools::install_github("rstudio/sparklyr") install.packages(c("nycflights13",…

r shiny rstudio sparklyr

asked Jun 24 '17 at 00:41

madhatter5

129
2
15

0 votes, 1 answer

How to export sparklyr (Spark ML) models to PMML?

I know that Spark ML pipelines can be exported to PMML using the JPMML-SparkML library. I am just struggling to find out how I could do it from R using sparklyr. I am aware of an open GitHub issue, where two ideas were raised: using the Scala API,…

r scala apache-spark sparklyr

asked Jun 22 '17 at 14:16

michalrudko

1,432
2
16
30

0 votes, 1 answer

Hive: how to convert millisecond timestamps?

I am trying to use the Hive UDFs (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions) from sparklyr to properly read in some timestamps. Unfortunately, I have not been able to correctly parse the…

hadoop hive sparklyr

asked Jun 20 '17 at 13:15

ℕʘʘḆḽḘ

18,566
34
128
235

0 votes, 3 answers

Is it possible to connect to MongoDB from sparklyr?

I can connect to MongoDB from SparkR (I am using RStudio, Spark 2.x.x, Mongo connector v2.0) as described here: https://docs.mongodb.com/spark-connector/current/r-api/. I would like to do the same using sparklyr. Is that possible? Could not find any…

mongodb sparklyr

asked Jun 10 '17 at 23:17

Amit Arora

169
3
15

0 votes, 0 answers

Shiny and Spark: where to run spark_connect?

Following my "how to free Spark resources?" post, does it matter where you place the (sparklyr) spark_connect call in server.R: within or outside the shinyServer(function(input, output, session)?

r apache-spark sparklyr

asked Jun 06 '17 at 12:35

guzu92

737
1
12
28
