
I have a mutable Scala Set:

val valueSet = scala.collection.mutable.Set[Int](0, 1, 2)

When I perform

valueSet -= 1 

the result is Set(0, 2).

But when I perform the same thing inside a loop or map:

Range(0, 10).map(entry => valueSet -= 1)
valueSet
res130: scala.collection.mutable.Set[Int] = Set(0, 1, 2)

the content of valueSet remains Set(0, 1, 2).

I need to run a loop and, based on some condition, remove elements from the Set until either the loop ends or the Set becomes empty. I tried printing valueSet inside the loop and the removals show up there, but once the loop ends valueSet is back to the original set.

Using the immutable version would seriously hurt the performance of the code, which is why I am using the mutable version.
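For a plain (non-distributed) collection, the loop described above can be sketched as follows. This is only an illustration: the condition `entry % 2 == 0` is a hypothetical stand-in for the real predicate, which the question does not state.

```scala
import scala.collection.mutable

val valueSet = mutable.Set[Int](0, 1, 2)
val entries = Range(0, 10).iterator

// Stop when the range is exhausted or the set is empty, whichever comes first.
while (entries.hasNext && valueSet.nonEmpty) {
  val entry = entries.next()
  // Hypothetical condition: remove only even entries.
  if (entry % 2 == 0) valueSet -= entry
}

println(valueSet) // Set(1)
```

An explicit while loop is used here because it allows the early exit on an empty set; foreach would always walk the whole range.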

Please help!

EDIT: I am using the spark-shell REPL (Spark 1.6.1).

I tried several more things and figured out that it doesn't work when I run the loop or map on an RDD, but it does work for non-distributed collections. I am guessing this has something to do with the fact that map on an RDD is a transformation and doesn't perform any action. But that is just my guess.
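Laziness is probably only part of it: Spark serializes the closure passed to an RDD operation and runs it on the executors, so each task mutates its own copy of valueSet rather than the one on the driver. One possible workaround, sketched below, is to let the RDD decide what should be removed, collect that to the driver, and mutate the set there. The sketch assumes it is pasted into spark-shell, where `sc` is already defined; `rdd`, `snapshot`, `toRemove` and the `x != 1` condition are made-up names and values for illustration.

```scala
import scala.collection.mutable

// Assumes spark-shell, where `sc` (the SparkContext) is predefined.
val valueSet = mutable.Set[Int](0, 1, 2)
val rdd = sc.parallelize(0 until 10)

// Ship an immutable snapshot to the executors and only *decide* there.
val snapshot = valueSet.toSet
// Hypothetical condition: remove every member of the set except 1.
val toRemove = rdd.filter(x => snapshot.contains(x) && x != 1).collect()

// Mutate on the driver, where the original set lives.
valueSet --= toRemove
```

This only makes sense if the collected result is small enough to fit on the driver; for large results an accumulator or a different design would be worth looking at.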

Shashi K
  • As the name `map` implies, it maps over a structure. It is not meant for updating things, but for transforming the elements of said structure. Have you actually tried the immutable version? I think this could be solved far more elegantly (and functional-ish) by using recursion. – rethab Nov 15 '16 at 08:08
  • "The content of valueSet remain: Set(0, 1, 2).". No, it does not. I have just tried it in REPL to be sure. The content of `valueSet` is changed to `Set(0, 2)`. Perhaps what you did in map was not `valueSet -= 1`, but `valueSet - 1`? – Suma Nov 15 '16 at 08:46
  • And yes, there has to be an elegant way; I am looking into it. I just wanted something quick and dirty to get started. – Shashi K Nov 15 '16 at 09:18
  • If this is a Spark problem, you should perhaps post real Spark code, and not something that is not representative of the problem you are solving. – Suma Nov 15 '16 at 18:50

2 Answers

It works: foreach removes each entry of the range that exists in the set.

val valueSet = scala.collection.mutable.Set[Int](0, 1, 2)
Range(0, 10).foreach(entry => valueSet -= entry)

println(valueSet.size) // size = 0
FaigB
  • Yes, just after I posted this I was trying things and figured out that it fails only when the object we are applying map to is distributed in nature, like an RDD. – Shashi K Nov 15 '16 at 09:15

Maybe a for comprehension, since I'm guessing your actual predicate is more involved than just removing the value 1 from the set. It returns one new mutable Set, but it does not generate an intermediate set for each value in the range.

scala> for {
     |   x <- valueSet
     |   if(x != 1)     // or whatever
     | } yield x

res1: scala.collection.mutable.Set[Int] = Set(0, 2)
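Note that the for comprehension builds a new set and leaves valueSet itself unchanged. If the goal is to shrink the existing set in place, `retain` on the mutable Set is one option. A minimal sketch, using the same `x != 1` placeholder predicate:

```scala
import scala.collection.mutable

val valueSet = mutable.Set[Int](0, 1, 2)

// Keep only the elements that satisfy the predicate; everything else is
// removed from valueSet itself, no new set is built.
// (On Scala 2.13+ `retain` is deprecated in favour of `filterInPlace`.)
valueSet.retain(x => x != 1)

println(valueSet) // Set(0, 2)
```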
jacks