I want to remove (or replace with a non-blank value) all words of a length less than 2 in sparklyr.
My attempt is below, but it doesn't work:
Tab8b <- tab8 %>% ft_sql_transformer(
sql="select * ,
regexp_replace(VAR,…
Hi, I am trying to create a pie chart in R using the genderizeR package.
I am referring to the code below from the site https://www.r-bloggers.com/the-gender-of-big-data/:
library(rvest)
library(stringr)
library(dplyr)
library(genderizeR)
library(ggplot2)
…
Similar to: SparklyR removing a Table from Spark Context, but different because:
The above question asks how to remove a "table" from Spark, here created by the copy_to function. If the spark_read_csv() function is used instead it appears that there…
We have a requirement in which we plan to use sparklyr to execute model code written in R on Spark. The Spark cluster we use is a Kerberized cluster. We are able to connect to this cluster and execute our code using a keytab. The challenge we…
I have installed RStudio on my local laptop and am trying to access files located on an AWS server (Windows).
I do not want to use the FTP protocol.
What are other possible ways to remotely access the files located on a remote server?
How to use SCP/SSH…
Is there a dplyr equivalent that does this? I'm after a 'replace all' that replaces the string xxx with NA:
is.na(df) <- df=="xxx"
I want to execute a sparklyr command on a Spark DataFrame using the pipe operator from R
tbl(sc,"df") %>%
and sticking the…
I'm running out of memory when I try to fit a random forest model on my dataset (5888 bytes) using the rsparkling random forest function with the following:
h2o.randomForest(x = x,
y = y,
training_frame =…
I have a 3 GB CSV file called accelerometer.csv on my computer. I wanted to read it into Spark using R and the sparklyr package just as an experiment before importing seriously big data (180 GB).
I used this code here:
spark_c <- spark_connect(master =…
I have Spark 1.6.2 installed on my system. I am also using R (3.4.0) with rstudio-server 1.0.143 on a CentOS 6.9 machine.
Whenever I run the command
sc <- spark_connect(master = "local")
it shows an error message stating that:
Error in…
I am trying to load about 40 large JSON files (150-200 GB each on average) into Spark using sparklyr. Some of the files would fit entirely in the RAM of a cluster; some of them would be too big.
Unfortunately, the command…
I know this is a very simple question, and I assume it has been asked before but I have been unable to find it. I would like to learn sparklyr. However, I wrote
devtools::install_github("rstudio/sparklyr")
install.packages(c("nycflights13",…
I know that Spark ML pipelines can be exported to PMML using the JPMML-SparkML library. I am just struggling to find out how I could do it from R using sparklyr.
I am aware of an open GitHub issue, where two ideas were raised:
using Scala API,…
I am trying to use the Hive UDFs (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions)
from sparklyr to properly read in some timestamps.
Unfortunately, I have not been able to correctly parse the…
I can connect to MongoDB from SparkR (I am using RStudio, Spark 2.x.x, Mongo connector v2.0) as described here: https://docs.mongodb.com/spark-connector/current/r-api/. I would like to do the same using sparklyr. Is that possible? Could not find any…
Following up on my post "How to free Spark resources?", does it matter where you place the (sparklyr) spark_connect call in server.R: inside or outside shinyServer(function(input, output, session)?