The idiom for finding a result within a mapping of a collection goes something like this:

list.view.map(f).find(p)

where list is a List[A], f is an A => B, and p is a B => Boolean.
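
For reference, here is a minimal sketch of what I expect the idiom to do in the plain sequential case (no `par`). The object name `ViewFindSketch`, the `run` helper, and the `evaluated` buffer are made up for illustration; the buffer just records which inputs `f` was actually applied to, standing in for the `println`.

```scala
import scala.collection.mutable.ListBuffer

object ViewFindSketch {
  // Returns the search result together with the inputs f was actually applied to.
  def run(): (Option[Int], List[Int]) = {
    val evaluated = ListBuffer.empty[Int]               // hypothetical bookkeeping, not part of the idiom
    val f: Int => Int = i => { evaluated += i; i + 10 } // records each input it sees
    val p: Int => Boolean = _ > 5
    val list = (1 to 10).toList

    // The view makes map lazy, so find can stop at the first mapped element satisfying p.
    val result = list.view.map(f).find(p)
    (result, evaluated.toList)
  }

  def main(args: Array[String]): Unit = {
    val (result, evaluated) = run()
    println("result = " + result + ", f applied to: " + evaluated)
  }
}
```

On a sequential view this should yield `Some(11)` with `f` applied only to the first element, since `1 + 10` already satisfies `_ > 5`. That is the behavior I would hope to see every time.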

Is it possible to use view with parallel collections? I ask because I'm getting some very odd results:

Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val f : Int => Int = i => {println(i); i + 10}
f: Int => Int = <function1>

scala> val list = (1 to 10).toList
list: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> list.par.view.map(f).find(_ > 5)
1
res0: Option[Int] = Some(11)

scala> list.par.view.map(f).find(_ > 5)
res1: Option[Int] = None
Luigi Plinge
  • Uh, don't know exactly what's going on, but it's the `view` that's producing the odd behavior. It does the same under 2.8, which doesn't even have `par`. – Michael Lorton Dec 12 '11 at 08:04
  • Looks like a bug to me, and the behavior is still present on trunk. – Daniel C. Sobral Dec 12 '11 at 13:24
  • @Malvolio That's strange; I tried it without the `.par`, and it worked as expected for me (result consistently `Some(11)`). It also worked with the `.par` and no `.view`, just not with both. – Luigi Plinge Dec 12 '11 at 14:55
  • The concepts of `view` and parallel collections are sort of...opposites. `view` allows you to consume the collection lazily, while a parallel collection is meant to be consumed, well, in parallel. How do you justify using both? – Dan Burton Dec 12 '11 at 22:08
  • @Dan I see your point, but it makes sense for methods like `find`. The existence of a `view` method in `ParSeqLike` returning a `ParSeqView` suggests that the library designers thought a `ParSeqView` is a useful thing. – Luigi Plinge Dec 12 '11 at 23:03
  • Here I get the same behavior as @LuigiPlinge: problems only appear when `par` is used. The correct output is returned the first time, `None` all the other times. – Blaisorblade Feb 21 '12 at 08:46

1 Answer

See "A Generic Parallel Collection Framework", the paper by Martin Odersky et al. that discusses the new parallel collections. Page 8 has a section, "Parallel Views", that explains how view and par can be used together to get the performance benefits of both views and parallel computation.

As for your specific example, that is definitely a bug. The exists method also breaks, and once it breaks on one list it breaks for all other lists too. So I think the problem is that operations that may be aborted partway through (find and exists can stop once they have an answer) manage to break the thread pool in some way. It could be related to the bug with exceptions being thrown inside functions passed to parallel collections. If so, it should be fixed in 2.10.

wingedsubmariner
  • Thanks for the link to that paper. The bug you link to doesn't sound that much like this one, although if it is the cause then someone with the latest 2.10 build should be able to verify. To make sure, I opened a new one just now: https://issues.scala-lang.org/browse/SI-5525 – Luigi Plinge Feb 27 '12 at 00:41