I made a mistake while implementing Scala futures, or at least I think I did, and I only just noticed it. However, when I fix the mistake, the code runs much slower than when I don't use futures. Can someone help me understand what is going on?
I have a slow method that I need to run 5,000 times. Each one is independent and returns a Double. I then need to calculate the mean and standard deviation of the 5,000 returned values.
When I coded it initially I did it this way:
import actors.Futures._
import util.Random
import actors.Future

def one = {
  var results = List[Future[Double]]()
  var expectedResult: List[Double] = Nil
  var i = 0
  while (i < 1000) {
    val f = future {
      Thread.sleep(scala.util.Random.nextInt(5) * 100)
      println("Loop count: " + i)
      Random.nextDouble
    }
    results = results ::: List(f)
    println("Length of results list: " + results.length)
    results.foreach(future => {
      expectedResult = future() :: expectedResult
      i += 1
    })
  }
  // I would return the list of Doubles here to calculate mean and StDev
  println("### Length of final list: " + expectedResult.length)
}
I didn't think anything of it, as it ran fast and I was getting the results I expected. When I took a closer look to try to make it run faster (it wasn't using all the CPU resources I had available), I realized that my loop counter was in the wrong spot and that the foreach was inside the future creation loop and, as a result, was blocking on the futures early. Or so I thought.
I stuck in a couple of println statements to see if I could figure out what was happening, and I became very confused. The length of the results list did not match the final list length, and neither matched the loop counter!
I modified my code to the following, based on what I thought should be happening. Things got much slower, and the output of the print statements didn't make any more sense than in the first method. This time the loop counter seems to jump straight to 1000, although the final list length makes sense.
The second method does use all available CPU resources, which is more along the lines of what I would expect, but it takes longer for what I am pretty sure is the same result.
def two = {
  var results = List[Future[Double]]()
  var expectedResult: List[Double] = Nil
  var i = 0
  while (i < 1000) {
    val f = future {
      Thread.sleep(scala.util.Random.nextInt(5) * 100)
      println("Loop count: " + i)
      Random.nextDouble
    }
    results = f :: results
    i += 1
    println("Length of results list: " + results.length)
  }
  results.foreach(future => {
    expectedResult = future() :: expectedResult
  })
  // I would return the list of Doubles here to calculate mean and StDev
  println("### Length of final list: " + expectedResult.length)
}
Am I missing something obvious here?
Edit
For anyone looking at this... the problem was that I was re-adding the results of futures to my final list (expectedResult) within the futures loop - as pointed out by som-snytt.
So with each loop through I would repeatedly iterate over the completed futures and get:
//First Loop:
List(1)
//Second Loop:
List(1,2)
//Third Loop:
List(1,2,3)
//... and so on
The pattern in the final list was this:
List(n, n-1, ..., 2, 1, n-1, ..., 2, 1, ..., 3, 2, 1, 2, 1, 1)
Since the list was 5050 items long and held Double values, it was hard to see the pattern when I only looked at the start of the list.
Ultimately, the number of loops was really only 100 and not the 5000 I needed.
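The 100 and 5050 figures can be checked without any futures at all. The following is a minimal sketch that reproduces only the counting logic of the first method; the name simulate and its limit parameter are made up for illustration, with limit standing in for the loop bound (5000 in my real run, 1000 in the code posted above).

```scala
// Simulates only the counter arithmetic of method one; no futures are created.
// `simulate` and `limit` are hypothetical names used for illustration.
def simulate(limit: Int): (Int, Int) = {
  var i = 0               // the shared loop counter
  var futuresCreated = 0  // one future is created per outer iteration
  var collected = 0       // length of expectedResult
  while (i < limit) {
    futuresCreated += 1
    // The inner foreach visits every future created so far, adding one
    // value to expectedResult and incrementing `i` once per future.
    collected += futuresCreated
    i += futuresCreated
  }
  (futuresCreated, collected)
}

println(simulate(5000)) // (100,5050)
println(simulate(1000)) // (45,1035)
```

So the counter grows as 1 + 2 + ... + k, and the loop stops as soon as that sum reaches the bound, which happens after 100 iterations for a bound of 5000.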
Version two of the method is correct for Scala 2.9.
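For anyone on Scala 2.10 or later, where scala.concurrent.Future replaced the actors-based futures, the same "start everything first, then collect" shape might look roughly like the sketch below. This is not what I ran; slowMethod, runAll, mean and stdDev are hypothetical names, the sleep is a stand-in for the real work, and the 10 minute timeout is an arbitrary assumption. It is written to be pasted into the REPL or run as a script.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Random

// Hypothetical stand-in for the slow, independent computation.
def slowMethod(): Double = {
  Thread.sleep(Random.nextInt(5) * 10L)
  Random.nextDouble()
}

// Start all futures first, then block once on the combined result.
def runAll(n: Int): Seq[Double] = {
  val futures = Seq.fill(n)(Future(slowMethod()))
  Await.result(Future.sequence(futures), 10.minutes) // timeout is an assumption
}

def mean(xs: Seq[Double]): Double = xs.sum / xs.size

// Population standard deviation (divides by n, not n - 1).
def stdDev(xs: Seq[Double]): Double = {
  val m = mean(xs)
  math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.size)
}

val values = runAll(100)
println("Count: " + values.size)
println("Mean: " + mean(values) + ", StDev: " + stdDev(values))
```

Because each value is collected exactly once from the sequenced future, the result list has one entry per future, which avoids the duplicate entries I had in the first method.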