Scala Futures - confused by CPU load and output of two approaches

Question

I think I made a mistake while implementing Scala futures and only just noticed it. However, when I fix the mistake, the code runs much slower than when I don't use futures. Can someone help me understand what is going on?

I have a slow method that I need to run 5,000 times. Each one is independent and returns a Double. I then need to calculate the mean and standard deviation of the 5,000 returned values.

When I coded it initially I did it this way:

import actors.Futures._
import util.Random
import actors.Future

def one = {
  var results = List[Future[Double]]()
  var expectedResult: List[Double] = Nil
  var i = 0

  while (i < 1000) {
    val f = future {
      Thread.sleep(scala.util.Random.nextInt(5) * 100)
      println("Loop count: " + i)
      Random.nextDouble
    }
    results = results ::: List(f)
    println("Length of results list: " + results.length)

    results.foreach(future => {
      expectedResult = future() :: expectedResult
      i += 1
    })
  }
  // I would return the list of Doubles here to calculate mean and StDev
  println("### Length of final list: " + expectedResult.length)
}

I didn't think anything of it as it ran fast and I was getting the results I expected. When I took a closer look at it to try and make it run faster (it wasn't using all the CPU resources I had available), I realized that my loop counter was in the wrong spot and that the foreach was inside the future creation loop and as a result blocking the futures early. Or so I thought.

I added a couple of println statements to see what was going on and ended up very confused: the length of the results list did not match the final list length, and neither matched the loop counter!

I modified my code to the following based on what I thought was (should be) happening and things got much slower and the output of the print statements didn't make any more sense than in the first method. This time the loop counter seems to jump to 1000 although the final list length makes sense.

The second method does use all available CPU resources which is more along the lines of what I would expect but it takes longer for what I am pretty sure is the same result.

def two = {
  var results = List[Future[Double]]()
  var expectedResult: List[Double] = Nil
  var i = 0

  while (i < 1000) {
    val f = future {
      Thread.sleep(scala.util.Random.nextInt(5) * 100)
      println("Loop count: " + i)
      Random.nextDouble
    }
    results = f :: results
    i += 1
    println("Length of results list: " + results.length)

  }
  results.foreach(future => {
    expectedResult = future() :: expectedResult
  })
  // I would return the list of Doubles here to calculate mean and StDev
  println("### Length of final list: " + expectedResult.length)
}

Am I missing something obvious here?

Edit

For anyone looking at this... the problem was that I was re-adding the results of futures to my final list (expectedResult) within the futures loop - as pointed out by som-snytt.

So with each loop through I would repeatedly iterate over the completed futures and get:

//First Loop: 
List(1)
//Second Loop:
List(1,2)
//Third Loop:
List(1,2,3)
//... and so on

The pattern in the final list was this:

List(n, n-1, n-2, ..., 4, 3, 2, 1, 3, 2, 1, 2, 1, 1)
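
The pattern can be reproduced without any futures. The following is only a sketch: `ReaddPattern` and `finalList` are hypothetical names, and plain Ints stand in for the future results. Each pass appends one new element and then prepends every element seen so far, as method one does.

```scala
object ReaddPattern {
  // Mimics method one with plain Ints instead of futures: on every pass,
  // append one new element, then prepend *every* element seen so far.
  def finalList(passes: Int): List[Int] = {
    var results = List.empty[Int]
    var expected = List.empty[Int]
    for (n <- 1 to passes) {
      results = results :+ n
      results.foreach(r => expected = r :: expected)
    }
    expected
  }
}
```

For three passes this gives List(3, 2, 1, 2, 1, 1), and for 100 passes the list has 5050 elements.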

Since the list was 5050 items long and held Double values, it was hard to see the pattern when I only looked at the start of the list.

Ultimately the number of loops was really only 100 and not the 5000 I needed.

Version two of the method is correct for Scala 2.9.

som-snytt · Accepted Answer · 2012-12-17T17:59:00.723

Am I missing something obvious here?

No. It's fair to say that imperative-style programming makes everything non-obvious.

In one, you're iterating over results repeatedly, bumping i.

Last time through:

Length of results list: 45
Loop count: 990
### Length of final list: 1035

i counts the final list, and applying a future adds the length of results, so the math is right: 45 + 990 = 1035.
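
The same arithmetic can be checked with a small simulation that involves no futures at all. This is a sketch under the assumption that each pass adds one future and that the inner foreach bumps i once per future collected so far; `LoopCounter` and `simulate` are hypothetical names.

```scala
object LoopCounter {
  // Returns (number of passes through the while loop, final value of i).
  // The final value of i also equals the length of the final list, because
  // every increment of i in method one goes with one prepend.
  def simulate(limit: Int): (Int, Int) = {
    var i = 0
    var resultsLength = 0
    var passes = 0
    while (i < limit) {
      resultsLength += 1   // one new future appended per pass
      i += resultsLength   // foreach bumps i once per future in results
      passes += 1
    }
    (passes, i)
  }
}
```

`LoopCounter.simulate(1000)` gives (45, 1035), matching the output above, and `LoopCounter.simulate(5000)` gives (100, 5050).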

Applying futures that are completed just gets the value; you block only to wait, so you wouldn't necessarily notice a performance problem getting the future value over and over.
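
To illustrate that point, here is a sketch using the newer scala.concurrent API rather than the scala.actors futures from the question (that swap is an assumption made so the example runs on current Scala versions). `RepeatedGet` and `demo` are hypothetical names. The body of the future runs once, and reading the completed value twice just returns the stored result.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object RepeatedGet {
  // Returns (first read, second read, number of times the body ran).
  def demo(): (Double, Double, Int) = {
    val evaluations = new AtomicInteger(0)
    val f = Future {
      evaluations.incrementAndGet() // counts executions of the body
      0.5
    }
    val first = Await.result(f, 10.seconds)  // may block until completion
    val second = Await.result(f, 10.seconds) // already completed: no recomputation
    (first, second, evaluations.get)
  }
}
```

`RepeatedGet.demo()` returns (0.5, 0.5, 1): two reads, one evaluation.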

But note that inside the future you're closing over the var i itself (see Captured by Closures), not the value i had when the future was created. As a bonus confusion, "Loop count" is unreliable as reported because access to i is not synchronized.
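
The capture behaviour can be seen without any concurrency. In this sketch (`ClosureCapture`, `capturedVar` and `capturedCopy` are hypothetical names), closures are built in a while loop and run only after the loop has finished, which is roughly what happens when a future body runs later than the statement that created it.

```scala
object ClosureCapture {
  // Each thunk closes over the variable i itself, so all of them
  // see whatever i holds at the time they are finally run.
  def capturedVar(): List[Int] = {
    var i = 0
    var thunks = List.empty[() => Int]
    while (i < 3) {
      thunks = thunks :+ (() => i)
      i += 1
    }
    thunks.map(_())
  }

  // Copying i into a val gives each thunk its own immutable binding.
  def capturedCopy(): List[Int] = {
    var i = 0
    var thunks = List.empty[() => Int]
    while (i < 3) {
      val snapshot = i
      thunks = thunks :+ (() => snapshot)
      i += 1
    }
    thunks.map(_())
  }
}
```

`capturedVar()` returns List(3, 3, 3), while `capturedCopy()` returns List(0, 1, 2). This is why the "Loop count" printed inside a future does not identify the iteration that created it.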

I didn't think anything of it as it ran fast and I was getting the results I expected.

There's so much engineering wisdom packed into that observation.

Here are two other formulations for 2.9:

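  import scala.actors.Futures._   // future { ... } and applying a Future
  import scala.util.Random._      // nextInt, nextDouble

  // Parallel collection: the map body runs in parallel on a thread pool.
  def four = (1 to 1000).par map { i =>
    Thread sleep nextInt(5) * 100
    Console println "Loop count: " + i
    nextDouble
  }

  // Start all 1000 futures first, then block on each one in turn.
  def three =
    (1 to 1000) map (i => future {
        Thread sleep nextInt(5) * 100
        Console println "Loop count: " + i
        nextDouble
    }) map (_())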

Here is the new API in 2.10, just for comparison.

import scala.concurrent._
import scala.concurrent.duration._
import scala.util._

object Test extends App {
  import ExecutionContext.Implicits.global
  import Random._
  def compute(i: Int) = future {
    Thread.sleep(nextInt(5) * 100)
    val res = nextDouble
    println(s"#$i = $res")
    res
  }
  val f = Future.traverse(1 to 1000)(compute)
  val res = Await result (f, Duration.Inf)
  println(s"Done with ${res.length} results")
}
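
The question also asks for the mean and standard deviation of the collected values. A minimal sketch follows; `Stats` and `meanAndStdDev` are hypothetical names, and it assumes the population standard deviation (dividing by n) is wanted rather than the sample one (dividing by n - 1).

```scala
object Stats {
  // Population mean and standard deviation of a non-empty sequence.
  def meanAndStdDev(xs: Seq[Double]): (Double, Double) = {
    require(xs.nonEmpty, "need at least one value")
    val mean = xs.sum / xs.length
    val variance = xs.map(x => (x - mean) * (x - mean)).sum / xs.length
    (mean, math.sqrt(variance))
  }
}
```

With the 2.10 example above, this could be called as `Stats.meanAndStdDev(res)` once the Await has returned.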

Thanks for the answer and the explanation. It took me a while to fully grasp but now I understand what was happening. — Mike Lavender, Dec 17 '12 at 12:35
Agreed, I updated the answer to reflect the fact that it was not obvious. Parallelism is hard enough without a big while loop with braces that are far apart. — som-snytt, Dec 17 '12 at 17:24
