10

Suppose I would like to calculate Pi with Monte Carlo simulation as an exercise.

I am writing a function, which picks a point in a square (0, 1), (1, 0) at random and tests if the point is inside the circle.

import scala.math._
import scala.util.Random

def circleTest() = {
  val (x, y) = (Random.nextDouble, Random.nextDouble)
  sqrt(x*x + y*y) <= 1
}

Then I am writing a function, which takes as arguments the test function and the number of trials and returns the fraction of the trials in which the test was found to be true.

def monteCarlo(trials: Int, test: () => Boolean) =
  (1 to trials).map(_ => if (test()) 1 else 0).sum * 1.0 / trials

... and I can calculate Pi

monteCarlo(100000, circleTest) * 4

Now I wonder if monteCarlo function can be improved. How would you write monteCarlo efficient and readable ?

For example, since the number of trials is large is it worth using a view or iterator instead of Range(1, trials) and reduce instead of map and sum ?

Michael
  • 41,026
  • 70
  • 193
  • 341
  • 4
    A major speed up should come from removing the sqrt. `sqrt(x*x + y*y) <= 1` is the same as `x*x + y*y <= 1` (i.e. squaring both sides) – The Archetypal Paul Sep 03 '14 at 15:23
  • 1
    Good point ! Thank you. – Michael Sep 03 '14 at 16:15
  • 3
    I've put a microbenchmarking page with the various methods here: https://gist.github.com/willf/e1f2ce95f04442af53e5 I'm glad to say recursion is the fastest :) – Will Fitzgerald Sep 03 '14 at 17:31
  • @WillFitzgerald, interesting, thanks for that. I'm actually quite surprised how fast the Stream based version is. Recursion is probably always going to be fastest at this sort of thing, as there's less overhead, but, as they say, "recursion is the goto of functional programming" - efficient, but not necessarily clearest or quickest to a working program. – The Archetypal Paul Sep 03 '14 at 18:04
  • 2
    When optimising, think about what it compiles to in machine code. The fastest things in machine code are always simple while loops. Method calls, object allocations (often necessary for lambdas), etc, are very expensive in tight loops like this. So the fastest would be a while loop. The recursion methods below are only fast because of tail call optimisation, which means they don't compile to a recursive method call in the byte code, but to a while loop. – James Roper Sep 04 '14 at 01:09
  • I guess I don't know who "they" are, but the recursive version I wrote is, what, 4 lines long. The general recursive pattern of finding the base case, and then recursing on the non-base case is a habit easy to pick up. – Will Fitzgerald Sep 04 '14 at 03:24
  • I've updated the gist at https://gist.github.com/willf/e1f2ce95f04442af53e5 to include the referentially transparent version. – Will Fitzgerald Sep 04 '14 at 03:37

6 Answers6

8

It's worth noting that Random.nextDouble is side-effecting—when you call it it changes the state of the random number generator. This may not be a concern to you, but since there are already five answers here I figure it won't hurt anything to add one that's purely functional.

First you'll need a random number generation monad implementation. Luckily NICTA provides a really nice one that's integrated with Scalaz. You can use it like this:

import com.nicta.rng._, scalaz._, Scalaz._

val pointInUnitSquare = Rng.choosedouble(0.0, 1.0) zip Rng.choosedouble(0.0, 1.0)

val insideCircle = pointInUnitSquare.map { case (x, y) => x * x + y * y <= 1 }

def mcPi(trials: Int): Rng[Double] =
  EphemeralStream.range(0, trials).foldLeftM(0) {
    case (acc, _) => insideCircle.map(_.fold(1, 0) + acc)
  }.map(_ / trials.toDouble * 4)

And then:

scala> val choosePi = mcPi(10000000)
choosePi: com.nicta.rng.Rng[Double] = com.nicta.rng.Rng$$anon$3@16dd554f

Nothing's been computed yet—we've just built up a computation that will generate our value randomly when executed. Let's just execute it on the spot in the IO monad for the sake of convenience:

scala> choosePi.run.unsafePerformIO
res0: Double = 3.1415628

This won't be the most performant solution, but it's good enough that it may not be a problem for many applications, and the referential transparency may be worth it.

Travis Brown
  • 138,631
  • 12
  • 375
  • 680
  • Thank you very much ! I was thinking about the side-effects of `Random` and how to make the Monte-Carlo pure. I do not accept this answer since it's out of the question scope but I appreciate it and start learning Random monad. – Michael Sep 04 '14 at 13:42
3

Stream based version, for another alternative. I think this is quite clear.

def monteCarlo(trials: Int, test: () => Boolean) =
    Stream
      .continually(if (test()) 1.0 else 0.0)
      .take(trials)
      .sum / trials

(the sum isn't specialised for streams but the implementation (in TraversableOnce) just calls foldLeft that is specialised and "allows GC to collect along the way." So the .sum won't force the stream to be evaluated and so won't keep all the trials in memory at once)

The Archetypal Paul
  • 41,321
  • 20
  • 104
  • 134
  • It is interesting that `reduceLeft` is _specialised_ that way. I like the solution but the fact that `reduceLeft` is _specialised_ makes it less intuitive, I think. – Michael Sep 03 '14 at 16:19
  • 1
    Not sure what you mean. reduceLeft works everywhere and does the same thing, it's just that the stream version has a specific implementation for efficiency. It's the same for many other collection methods - they may have specialised implementations for some collection types. You can replace this reduceLeft with sum, it will just force the stream contents to be created and held in memory (I think, not checked that) – The Archetypal Paul Sep 03 '14 at 16:19
  • Ah, no, just looked at the source and .sum just calls foldLeft so should be efficient on streams too. I'll edit my answer – The Archetypal Paul Sep 03 '14 at 16:24
  • I do not want to hold all 10000 elements of the stream in memory, of course. So `reduceLeft` is fine from the performance point of view. On the other hand I am afraid people just _don't know_ about this _special_ behaviour of `reduceLeft`. So they will think that the stream hold all its elements in memory. – Michael Sep 03 '14 at 16:25
  • BTW, why not use an `(1 to trial).iterator` rather than a `stream`? I hope it is lazy too. – Michael Sep 03 '14 at 16:38
  • My microbenchmarking indicates iterator is slower than a simple range (but it is faster than the stream) – Will Fitzgerald Sep 03 '14 at 17:59
  • @Michael, mostly because I wanted to do a Stream-based solution :) as it fits my conception of the problem (a stream of samples, which you only later decide what to do with) Also with a stream, you can compute 'enough' iterations in some dynamic way - testing how good your current estimate is, for instance, rather than determining the number of trials up front. Not applicable to your example maybe, but it's useful in other contexts. – The Archetypal Paul Sep 03 '14 at 18:09
  • 1
    @WillFitzgerald Thank you for taking your time to benchmark it ! – Michael Sep 03 '14 at 19:40
  • @Paul Don't you think that `take(10000)` on a `stream` allocate 10000 elements in memory ? – Michael Sep 03 '14 at 19:42
  • No, I don't think it does. `Stream.continually{println("evaluated");1}.take(10000) //> evaluated //| res0: scala.collection.immutable.Stream[Int] = Stream(1, ?)` - prints "evaluated" only once. It's lazy, only the head is evaluated until you access the other elements. – The Archetypal Paul Sep 03 '14 at 19:48
  • In an earlier variant (the one with foldLeft) I put printlns in both the sample-generation and summation code. They alternate, showing that elements were only generated when they were needed to add to the cumulative total. – The Archetypal Paul Sep 03 '14 at 19:53
2

I see no problem with the following recursive version:

def monteCarlo(trials: Int, test: () => Boolean) = {
  def bool2double(b: Boolean) = if (b) 1.0d else 0.0d
  @scala.annotation.tailrec
  def recurse(n: Int, sum: Double): Double = 
    if (n <= 0) sum / trials
    else recurse(n - 1, sum + bool2double(test()))
  recurse(trials, 0.0d)
}
Will Fitzgerald
  • 1,372
  • 10
  • 14
2

And a foldLeft version, too:

def monteCarloFold(trials: Int, test: () => Boolean) = 
  (1 to trials).foldLeft(0.0d)((s,i) => s + (if (test()) 1.0d else 0.0d)) / trials

This is more memory efficient than the map version in the question.

Will Fitzgerald
  • 1,372
  • 10
  • 14
1

Using tail recursion might be an idea:

def recMonteCarlo(trials: Int, currentSum: Double, test:() => Boolean):Double = trials match {
  case 0 => currentSum
  case x => 
    val nextSum = currentSum + (if (test()) 1.0 else 0.0)
    recMonteCarlo(trials-1, nextSum, test)

def monteCarlo(trials: Int, test:() => Boolean) = {
  val monteSum = recMonteCarlo(trials, 0, test)
  monteSum / trials
}
Ashalynd
  • 12,363
  • 2
  • 34
  • 37
1

Using aggregate on a parallel collection, like this,

def monteCarlo(trials: Int, test: () => Boolean) = {
  val pr = (1 to trials).par
  val s = pr.aggregate(0)( (a,_) => a + (if (test()) 1 else 0), _ + _) 
  s * 4.0 / trials
}

where partial results are summed up in parallel with other test calculations.

elm
  • 20,117
  • 14
  • 67
  • 113