
I am relatively new to SparkR, and I am planning to convert a for loop into a foreach loop in SparkR (R/3.3.3 & Spark/2.2.0).

I have searched Stack Overflow, and the only relevant thread is: SparkR foreach loop

But it only gives a workaround that uses other operations.

From what I can see, there is a SparkR package (https://amplab-extras.github.io/SparkR-pkg/rdocs/1.2/index.html) that contains a foreach function, but I do not really understand its use cases, so I would appreciate some help or an example from the community.

My original R code is as follows:

uniqueID <- unique(dataset$ID)
maxValueVector <- c()
for(id in uniqueID){
    # append the maximum value for this ID to the running result vector
    maxValueVector <- c(maxValueVector, max(dataset[which(dataset$ID == id), ]$value))
}

I understand that the line inside the for loop should be broken into several lines, but is there some example foreach code I can start with? Thanks a lot!

P.S. dataset contains 2 columns: ID and value.

windsound
    a) That's really not how we express things in Spark. b) The SparkR package you've linked was abandoned many years ago, and even if it hadn't been, its `foreach` wouldn't be applicable here. For the new API see https://spark.apache.org/docs/latest/sparkr.html (Hint: focus on the [`groupBy`](https://spark.apache.org/docs/latest/api/R/groupBy) docs). – zero323 Nov 22 '18 at 21:22

1 Answer

As the comment says, in SparkR we generally do not want to use foreach. In this particular case, I solved the problem using SparkDataFrame operations:

## Collect the distinct IDs to the driver as a local data.frame
IDs <- collect(distinct(select(dataset, 'ID')))
## Compute the maximum value per ID in a new maximums column, to be used in later steps;
## this is basically what I needed.
Maximums <- agg(groupBy(dataset, dataset$ID), maximums = max(dataset$value))
Maximums <- arrange(Maximums, desc(Maximums$maximums))
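
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same grouped aggregation. The local master setting, the app name, and the toy data in `local_df` are my own assumptions for illustration and are not from the question; only the `ID` and `value` columns are.

```r
library(SparkR)

# Assumption: a local Spark session is enough for a small illustration.
sparkR.session(master = "local[1]", appName = "max-per-id-sketch")

# Hypothetical toy data with the same two columns as `dataset`.
local_df <- data.frame(ID = c("a", "a", "b", "b", "c"),
                       value = c(1, 5, 3, 2, 7),
                       stringsAsFactors = FALSE)
dataset <- createDataFrame(local_df)

# One grouped aggregation replaces the whole for loop.
maximums <- agg(groupBy(dataset, dataset$ID), maximums = max(dataset$value))

# Sort by ID and bring the (small) result back to the driver as a data.frame.
result <- collect(arrange(maximums, maximums$ID))

# Named vector, comparable to maxValueVector in the original loop.
maxValueVector <- setNames(result$maximums, result$ID)
print(maxValueVector)

sparkR.session.stop()
```

Note that `collect` is only reasonable here because the result has one row per ID; the full dataset stays distributed.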

I am still new to this, so this solution may not be exactly what you are looking for. But thanks again for the comments!
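
If the per-ID logic ever becomes more involved than a built-in aggregate, `gapply` is probably the closest SparkR counterpart to the original loop, since it runs an ordinary R function once per group. The following is only a sketch under assumptions: the toy data, the local session settings, and the helper name `max_per_group` are hypothetical, and the output schema has to be declared by hand.

```r
library(SparkR)

# Assumption: a local Spark session is enough for a small illustration.
sparkR.session(master = "local[1]", appName = "gapply-sketch")

# Hypothetical toy data with the same two columns as `dataset`.
dataset <- createDataFrame(data.frame(ID = c("a", "a", "b", "b", "c"),
                                      value = c(1, 5, 3, 2, 7),
                                      stringsAsFactors = FALSE))

# Schema of the data.frame returned for each group.
schema <- structType(structField("ID", "string"),
                     structField("maximums", "double"))

# Hypothetical helper: `key` holds the grouping value(s), `x` is a local
# data.frame with the rows of one group.
max_per_group <- function(key, x) {
  data.frame(ID = key[[1]], maximums = max(x$value),
             stringsAsFactors = FALSE)
}

# Apply the helper to each ID group on the executors.
res <- gapply(dataset, "ID", max_per_group, schema)

# Sort by ID and collect the small result to the driver.
out <- collect(arrange(res, res$ID))
print(out)

sparkR.session.stop()
```

For a plain maximum, the `groupBy` and `agg` version is simpler and will usually be faster, because it does not have to ship each group through an R worker process.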

windsound