Questions tagged [sparkr]

SparkR is an R package that provides a light-weight frontend to use Apache Spark from R.

SparkR exposes the Spark API through the RDD class and allows users to interactively run jobs from the R shell on a cluster.

SparkR exposes the RDD API of Spark as distributed lists in R.

796 questions
0
votes
1 answer

how to install sparkR sql in sparkR

I am new to Spark. I am trying to convert my R program to SparkR for distribution purposes. When I try to initialize SparkR SQL, I get an error: sqlContext <- sparkRSQL.init(sc) Error: could not find function "sparkRSQL.init" How to install the…
arun abimaniyu
  • 167
  • 2
  • 12
0
votes
1 answer

Access data across R and Scala scripts

Is there a way to share data across R and Scala scripts assuming both scripts run against the same Spark cluster? For example, I want to pull data from a source system using Scala, and I want to access this data in R without having to persist…
jjreddick
  • 285
  • 2
  • 11
0
votes
1 answer

Sparkr convert broadcast RDD to actual value

I sent a broadcast BC to a worker node. In my program: BC = SparkR:::broadcast(sc, data) I have a function myF = function(x) { allV = SparkR:::value(BC) ..... Use allV ...... return(result) } Then I called this function: finalResult =…
user2146141
  • 155
  • 1
  • 14
0
votes
2 answers

what is the difference between 'abs' function in R and sparkR

The SparkR API has functions with the same names as functions in R; examples include the abs and cosine functions. What is the difference between the abs function in R and in SparkR? When does the abs function get executed in Spark? Documentation for SparkR…
DesirePRG
  • 6,122
  • 15
  • 69
  • 114
0
votes
1 answer

unionAll function can't run in sparkR

In SparkR I have a DataFrame data that contains id as well. I also have liste = 2 9 12 102 154 ... 1451, where length(liste) = 3001. I want the entries in data where id equals a value in liste. In SparkR I do this: newdata <- unionAll(filter(data, data$id ==…
Ole Petersen
  • 670
  • 9
  • 21
0
votes
2 answers

Is it possible to run a SparkR program in Spark without R interpreter installed?

My question is about the feasibility of running a SparkR program in Spark without an R dependency. In other words, can I run the following program in Spark when there is no R interpreter installed on the machine? #set env…
DesirePRG
  • 6,122
  • 15
  • 69
  • 114
0
votes
3 answers

To sort a specific column in a DataFrame in SparkR

In SparkR I have a DataFrame data. It contains time, game and id. head(data) then gives ID = 1 4 1 1 215 985 ..., game = 1 5 1 10 and time 2012-2-1, 2013-9-9, ... Now game contains a gametype, which is a number from 1 to 10. For a given gametype I…
Ole Petersen
  • 670
  • 9
  • 21
0
votes
1 answer

How to subtract elements in a DataFrame

In SparkR I have a DataFrame data that contains id, amount_spent and amount_won. For example, for id=1 we have head(filter(data, data$id==1)) and the output is 1 30 10 1 40 100 1 22 80 1 14 2. So far I want to know whether a fixed id has won more than it has lost.…
Ole Petersen
  • 670
  • 9
  • 21
0
votes
1 answer

Filter rows by timestamp in DataFrame of SparkR

I want to filter the rows of a DataFrame in SparkR by timestamp, with a format like the following: df <- createDataFrame(sqlContext, data.frame(ID = c(1,2,3), Timestamp=c('08/01/2014 11:18:30', …
Bamqf
  • 3,382
  • 8
  • 33
  • 47
0
votes
1 answer

SparkR documentation in detail

I want to use the functions of SparkR's Column class, but I can't find a detailed explanation of functions like cbrt, hypot and the like. Typing ?cbrt returns useless information. Is there anywhere I can find details of these column functions?
Bamqf
  • 3,382
  • 8
  • 33
  • 47
0
votes
0 answers

library packages not working with oozie

Hi, I am running Oozie with a shell script. In that shell script I am using SparkR jobs. Whenever I run Oozie jobs I get an error with library. Here is my error: Stdoutput Running…
sharon paul
  • 93
  • 2
  • 9
0
votes
1 answer

Exception in thread "delete Spark local dirs" java.lang.NullPointerException

Hi, I am running a SparkR program through a shell script. When I point the input file to a local path it works fine, but when I point it to HDFS it throws an error: Exception in thread "delete Spark local dirs" java.lang.NullPointerException Exception in…
sharon paul
  • 93
  • 2
  • 9
0
votes
1 answer

not able to run the shell script with oozie

Hi, I am trying to run the shell script through Oozie. While running the shell script I get the following error: org.apache.oozie.action.hadoop.ShellMain], exit code [1] My job.properties…
sharon paul
  • 93
  • 2
  • 9
0
votes
1 answer

Attach one element from a DataFrame in sparkR

I have a DataFrame in SparkR called 'data'. 'data' contains 'user', 'amount_spent' and 'amount_won'. I want to calculate balance = amount_spent - amount_won for user 1. y <- filter(data, data$user==1) Now I calculate the sums: yn <- agg(groupBy(y,…
Ole Petersen
  • 670
  • 9
  • 21
0
votes
1 answer

How to get the sum value without making it local

In SparkR I have a DataFrame u that contains 'amount' = 231,2,324,1213 ... To calculate the sum in SparkR I use summa <- agg(u, amount="sum"). Now summa is a DataFrame. I want to know the value of summa, and I can get that value by typing…
Ole Petersen
  • 670
  • 9
  • 21