Questions tagged [parallel-collections]
50 questions
4
votes
2 answers
What's the cost of converting a sequential collection into a parallel one, against creating it from scratch
According to the official docs, there are two ways to create a parallel collection:
1)
import scala.collection.parallel.immutable.ParVector
val pv = new ParVector[Int]
2)
val…

santiagobasulto
- 11,320
- 11
- 64
- 88
3
votes
2 answers
Scala parallel for running out of RAM
For a homework assignment, I am supposed to experiment with several threading mechanisms by numerically integrating a function whose result should be pi. The implementation is supposed to handle an interval of over 500 billion. My current…

Bbatha
- 33
- 3
3
votes
3 answers
Can I use Scala's parallel collections when I have several expensive operations I want to call on the same input and then collect the results?
I found a similar question but it has what seems to be a simpler case, where the expensive operation is always the same. In my case, I want to collect a set of results of some expensive API calls that I'd like to execute in parallel.
Say I have:
def…

pr1001
- 21,727
- 17
- 79
- 125
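For the case described in this question (several different expensive operations applied to the same input), one possible shape is sketched below. It uses `Future` from the standard library rather than parallel collections, because `.par` moved to a separate module from Scala 2.13 onwards. The three operations are hypothetical stand-ins for the API calls, not code from the question.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FanOut {
  // Hypothetical stand-ins for the expensive calls (assumptions, not from the question).
  val wordCount: String => Int = s => s.split("\\s+").count(_.nonEmpty)
  val charCount: String => Int = s => s.length
  val upperCount: String => Int = s => s.count(_.isUpper)

  val ops: List[String => Int] = List(wordCount, charCount, upperCount)

  // Start one Future per operation against the same input, then wait for all of them.
  // Future.sequence keeps the results in the same order as `ops`.
  def runAll[A, B](input: A, ops: List[A => B]): List[B] = {
    val futures = ops.map(op => Future(op(input)))
    Await.result(Future.sequence(futures), 30.seconds)
  }
}
```

Calling `FanOut.runAll("Hello big World", FanOut.ops)` returns one result per operation. On a Scala version where `.par` is available, mapping over a parallel collection of functions should behave similarly.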
3
votes
1 answer
Factorial calculation using Scala actors
How can the factorial be computed using Scala actors?
And would that be more time-efficient than, for instance,
def factorial(n: Int): BigInt = (BigInt(1) to BigInt(n)).par.product
Many Thanks.

elm
- 20,117
- 14
- 67
- 113
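As a point of comparison for the `.par.product` one-liner in this question, here is a minimal sketch that splits the range into chunks and multiplies each chunk in its own `Future`. It is neither an actor-based solution nor the asker's code; the object name and the `chunkSize` parameter are assumptions made for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ChunkedFactorial {
  // Multiply 1..n by computing partial products per chunk concurrently,
  // then combining the partial products on the calling thread.
  def factorial(n: Int, chunkSize: Int = 1000): BigInt = {
    require(n >= 0 && chunkSize > 0, "n must be >= 0 and chunkSize > 0")
    val partials = (1 to n).grouped(chunkSize).toList.map { chunk =>
      Future(chunk.foldLeft(BigInt(1))((acc, x) => acc * BigInt(x)))
    }
    Await.result(Future.sequence(partials), Duration.Inf)
      .foldLeft(BigInt(1))(_ * _)
  }
}
```

Whether this beats the parallel-collection version depends on `n` and the chunk size, so it would need measuring rather than assuming.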
3
votes
1 answer
Scala parallel unordered iterator
I have an Iterable of "work units" that need to be performed, in no particular order, and can easily run in parallel without interfering with one another.
Unfortunately, running too many of them at a time will exceed my available RAM, so I need to…

Mysterious Dan
- 1,316
- 10
- 25
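One way to cap how many work units run at once, as this question asks, is to pull the iterator in fixed-size batches. The sketch below assumes standard-library `Future`s instead of a parallel collection; `runBounded` and `maxInFlight` are hypothetical names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object BoundedWork {
  // Lazily consume `units` in batches of `maxInFlight`. Units inside a batch run
  // concurrently; the next batch is only started once the current one has finished,
  // so at most `maxInFlight` units are in progress at any time.
  def runBounded[A, B](units: Iterator[A], maxInFlight: Int)(work: A => B): Iterator[B] =
    units.grouped(maxInFlight).flatMap { batch =>
      val running = batch.map(u => Future(work(u)))
      Await.result(Future.sequence(running), Duration.Inf)
    }
}
```

The trade-off is that each batch waits for its slowest unit, so throughput suffers when unit run times vary a lot.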
2
votes
3 answers
Efficiency/scalability of parallel collections in Scala (graphs)
I've been using parallel collections in Scala for a graph project. I've got the basics of the graph class defined; it currently uses a scala.collection.mutable.HashMap where the key is Int and the value is…

adelbertc
- 7,270
- 11
- 47
- 70
2
votes
1 answer
Scala understanding memory usage with parallel collections
I am pretty new to Scala (loving the language) and have been dealing with reading streams/lazy lists lately.
I was messing around with parallelism as I had a task that was taking very long to do synchronously with a foldLeft (but didn't need to be…

Aserian
- 1,047
- 1
- 15
- 31
2
votes
2 answers
Will calling .seq on parallel collections ensure all threads are joined?
I have a collection on which I call .par, like this:
myCollection.par.map(element => longRunningOperation(element)).seq
println("after map")
Will calling .seq guarantee all threads are joined before continuing, and all maps completed, before…

Geo
- 93,257
- 117
- 344
- 520
2
votes
3 answers
With parallel collection, does aggregate respect order?
In Scala, I have a parallel Iterable of items, and I want to iterate over them and aggregate the results in some way, but in order. I'll simplify my use case and say that we start with an Iterable of integers and want to concatenate the string…

Heinrich Schmetterling
- 6,614
- 11
- 40
- 56
2
votes
0 answers
Spark UI active jobs getting stuck when using scala parallel collection
I have a DataFrame with 1000 columns, and I am trying to get some statistics by running some operations on each column. I need to sort each column, so I basically can't do multi-column operations on it. I am doing all these column operations in a…

Debasish
- 113
- 1
- 9
2
votes
1 answer
Scala - Sorting par sequences
val data: Seq[Something] = ...
val transformed = data.par.map(transform).toList
val sorted = transformed.sortWith(...)
How can I get rid of the toList when sorting par sequences?

User1291
- 7,664
- 8
- 51
- 108
2
votes
1 answer
How can a parallel array be reused?
I'm trying to use Scala's parallel collections to dispatch some computations in parallel. Because there's a lot of input data, I'm using mutable arrays to store data to avoid GC issues. This is the initial approach I took:
// initialize the reusable…

Ben Sidhom
- 1,548
- 16
- 25
2
votes
1 answer
scala parallel collections: Idiomatic way of having thread-local-variables for worker threads
The progress function below is my worker function. I need to give it access to some classes which are costly to create or acquire. Is there any standard machinery for thread-local variables in the libraries for this? Or will I have to write an object…

Hassan Syed
- 20,075
- 11
- 87
- 171
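For the thread-local question above, `java.lang.ThreadLocal` is one standard option. The sketch below gives each worker thread its own lazily built instance. The `Expensive` class and its counter are hypothetical, `progress` only borrows the name used in the question, and the work is dispatched with `Future`s rather than `.par`.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object PerThreadResource {
  // Counts how many Expensive instances were built (for illustration only).
  val created = new AtomicInteger(0)

  // Hypothetical costly-to-create helper.
  final class Expensive {
    created.incrementAndGet()
    def process(x: Int): Int = x * x
  }

  // Each thread that calls local.get() gets its own instance, built on first use.
  private val local: ThreadLocal[Expensive] = new ThreadLocal[Expensive] {
    override def initialValue(): Expensive = new Expensive
  }

  // Worker function: reuses the calling thread's instance.
  def progress(x: Int): Int = local.get().process(x)

  def runAll(xs: Seq[Int]): Seq[Int] =
    Await.result(Future.traverse(xs)(x => Future(progress(x))), 30.seconds)
}
```

Because a `ThreadLocal` is keyed by thread, the same `local.get()` call should also work from inside the body of a parallel-collection `map`, with one instance per pool thread.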
1
vote
3 answers
Distributing work to multiple cores: Hadoop or Scala's parallel collections?
What is the better way of making full use of multiple cores for parallel processing in a Scala/Hadoop system?
Let's say I need to process 100 million documents. Documents are not very large, but processing them is computationally intensive. If I…

Adrian
- 3,762
- 2
- 31
- 40
1
vote
2 answers
ParVector map is not running in parallel
I have a bit of code like:
val data = List(obj1, obj2, obj3, obj4, ...).par.map { ... }
and the ParVector is roughly 12 elements large. I noticed that all of my work is being done in the main thread so I traced down the stacktrace and found that in…

Mike Axiak
- 11,827
- 2
- 33
- 49