Questions tagged [parallel-collections]

50 questions
4
votes
2 answers

What's the cost of converting a sequential collection into a parallel one, against creating it from scratch

According to the official docs, there are two options to create parallel collections: 1) // There's a little bug here, doesn't matter for the sake of the question import scala.collection.parallel.mutable.ParArray val pv = new ParVector[Int] 2) val…
3
votes
2 answers

Scala parallel for running out of RAM

For a homework assignment, I am supposed to play with several threading mechanisms using a simple integration of a function that should result in pi. The implementation is supposed to handle an interval of over 500 billion. My current…
Bbatha
3
votes
3 answers

Can I use Scala's parallel collections when I have several expensive operations I want to call on the same input and then collect the results?

I found a similar question but it has what seems to be a simpler case, where the expensive operation is always the same. In my case, I want to collect a set of results of some expensive API calls that I'd like to execute in parallel. Say I have: def…
pr1001
3
votes
1 answer

Factorial calculation using Scala actors

How can I compute the factorial using Scala actors? And would it prove more time-efficient than, for instance,
def factorial(n: Int): BigInt = (BigInt(1) to BigInt(n)).par.product
Many thanks.
elm
3
votes
1 answer

Scala parallel unordered iterator

I have an Iterable of "work units" that need to be performed, in no particular order, and can easily run in parallel without interfering with one another. Unfortunately, running too many of them at a time will exceed my available RAM, so I need to…
2
votes
3 answers

Efficiency/scalability of parallel collections in Scala (graphs)

I've been working with parallel collections in Scala for a graph project. I've got the basics of the graph class defined; it is currently using a scala.collection.mutable.HashMap where the key is Int and the value is…
adelbertc
2
votes
1 answer

Scala understanding memory usage with parallel collections

I am pretty new to Scala (loving the language) and have been dealing with reading streams/lazy lists lately. I was messing around with parallelism as I had a task that was taking very long to do synchronously with a foldLeft (but didn't need to be…
Aserian
2
votes
2 answers

Will calling .seq on parallel collections ensure all threads are joined?

I have a collection on which I call .par, like this:
myCollection.par.map(element => longRunningOperation(element)).seq
println("after map")
Will calling .seq guarantee all threads are joined before continuing, and all maps completed, before…
Geo
2
votes
3 answers

With parallel collection, does aggregate respect order?

In Scala, I have a parallel Iterable of items and I want to iterate over them and aggregate the results in some way, but in order. I'll simplify my use case and say that we start with an Iterable of integers and want to concatenate the string…
Heinrich Schmetterling
2
votes
0 answers

Spark UI active jobs getting stuck when using scala parallel collection

I have a DataFrame of 1000 columns, and I am trying to get some statistics by doing some operations on each column. I need to sort each column, so I basically can't do multi-column operations on it. I am doing all these column operations in a…
2
votes
1 answer

Scala - Sorting par sequences

val data: Seq[Something] = ...
val transformed = data.par.map transform toList
val sorted = transformed.sortWith(...)
How can I get rid of the toList when sorting par sequences?
User1291
2
votes
1 answer

How can a parallel array be reused?

I'm trying to use Scala's parallel collections to dispatch some computations in parallel. Because there's a lot of input data, I'm using mutable arrays to store data to avoid GC issues. This is the initial approach I took: // initialize the reusable…
Ben Sidhom
2
votes
1 answer

scala parallel collections: Idiomatic way of having thread-local-variables for worker threads

The progress function below is my worker function. I need to give it access to some classes which are costly to create or acquire. Is there any standard machinery for thread-local variables in the libraries for this? Or will I have to write an object…
Hassan Syed
1
vote
3 answers

Distributing work to multiple cores: Hadoop or Scala's parallel collections?

What is the better way of making full use of multiple cores for parallel processing in a Scala/Hadoop system? Let's say I need to process 100 million documents. Documents are not very large, but processing them is computationally intensive. If I…
Adrian
1
vote
2 answers

ParVector map is not running in parallel

I have a bit of code like: val data = List(obj1, obj2, obj3, obj4, ...).par.map { ... } and the ParVector has roughly 12 elements. I noticed that all of my work is being done in the main thread, so I followed the stack trace and found that in…
Mike Axiak