2

Was at a Scala meetup and we were discussing the "Scala way" of doing things...

Someone asked a different developer how he / she would implement the Fibonacci sequence in Scala... The person answered with the following code (only to be told that while it worked, it was non-optimal):

def fibonacci(n: Int):BigInt = n match {
    case 0 => 0
    case 1 => 1
    case _ => fibonacci(n - 1) + fibonacci(n - 2)
}
  1. What's wrong with this method?

  2. What are the ways to improve this code, the Scala way?

PacificNW_Lover
  • 4,746
  • 31
  • 90
  • 144
  • 1
    Possible duplicate of: http://stackoverflow.com/questions/7388416/what-is-the-fastest-way-to-write-fibonacci-function-in-scala – Aaron Novstrup Jun 27 '14 at 05:02
  • You are saying that the question is a possible duplicate or that the method will produce a possible duplicate? – PacificNW_Lover Jun 27 '14 at 05:05
  • 2
    As for "what's wrong with this method?", that's more of a general algorithms question than a Scala issue. The problem is that it repeats **a lot** of work. – Aaron Novstrup Jun 27 '14 at 05:05

2 Answers2

2

The problem with that function, as described are the non-tail recursive calls. This means that the recursivity involved here needs a stack to work (in your sample, it's the call stack). In other words, that function is roughly the equivalent of:

import scala.collection.mutable.Stack

def fibonacci(n: Int): BigInt = {
  var result = BigInt(0)
  val stack = Stack.empty[Int]
  stack.push(n)

  while (stack.nonEmpty) {
    val x = stack.pop()
    if (x == 1) {
      result += 1
    }
    else if (x > 1) {
      stack.push(x - 2)
      stack.push(x - 1)
    }
  }

  result
}

As you can see, that's not very efficient, is it? On each iteration, the stack's size grows by one and because you can view the calls being made as a tree, that would be a proper binary tree whose size depends on N and the number of leaves on it approximately 2N (actually less but constant factors don't matter when N is big), so we are talking about O(2N) time complexity and O(n) memory complexity (i.e. needed stack size is N). Now that's exponential growth for time and linear growth for the memory used. What that means is that it takes a loooong time to process and it uses more memory than it should. Btw, it's a good idea as a software developer to reason in terms of Big O notation, because that's the first thing you need to look at when talking about performance or memory consumption.

Thankfully, for Fibonnaci we don't need that recursion. Here's a more efficient implementation:

def fibonacci(n: Int): BigInt = {
  var a = BigInt(0)
  var b = BigInt(1)
  var idx = 0

  while (idx < n) {
    val tmp = a
    a = b
    b = tmp + a
    idx += 1
  }

  a
}

This is just a plain loop. It doesn't need a stack to work. The memory complexity is O(1) (meaning it needs a constant amount of memory to work, independent of input). In terms of time this algorithm is O(n), roughly meaning that to process the result a loop involving N iterations is involved, so the growth in time does depend on input N, but is linear and not exponential.

In Scala, you can also describe this as a tail recursion:

import annotation.tailrec 

def fibonacci(n: Int): BigInt = {
  @tailrec
  def loop(a: BigInt, b: BigInt, idx: Int = 0): BigInt = 
    if (idx < n) 
      loop(b, a + b, idx + 1)
    else
      a

  loop(0, 1)
}

This loop is described as a recursive function, however because that's a "tail-recursive call", the compiler rewrites this function to a simple loop. You can also see the presence of the @tailrec annotation. It's not strictly necessary, the compiler will optimize that into a loop without it, but if you use this annotation, the compiler will error if the described function is not a tail-recursion - which is nice, because it's easy to make mistakes when relying on tail-recursion to work (i.e. you make a change and without noticing, bam, the function is not a tail-recursion anymore). Use this annotation, because the compiler can protect you if you do.

So in this case you work with immutable things (no more vars specified), yet it will have the same performance characteristics as the while loop. Which version you prefer, that's up to your preferences. I prefer the later, because it's easier for me to spot invariants and exit conditions, plus I like immutability, but other people prefer the former. And about the idiomatic way of doing this, you can also get fancy with a lazy Stream or Iterable, but nobody in the FP department will complain about tail recursions :-)

Alexandru Nedelcu
  • 8,061
  • 2
  • 34
  • 39
  • 1
    Wrong. It will take funny amounts of time to work long before there will be not enough stack space. – Display Name Jun 27 '14 at 08:17
  • @SargeBorsch a StackOverflowError is a prime example of billion dollars mistakes - leading to buffer overflows in C/C++ and thus vast security implications, or to the app crashing or misbehaving in non-deterministic ways in Java. Anyway I've also mentioned the time complexity right before you posted this comment. – Alexandru Nedelcu Jun 27 '14 at 08:25
  • @SargeBorsch - also, if memory growth is exponential, it will blow up pretty fast ;-) Try it out ... takes less than 1 second. – Alexandru Nedelcu Jun 27 '14 at 08:31
  • 1
    Why exponential memory growth? The tree height is limited by n, isn't it? – Display Name Jun 27 '14 at 09:22
  • @SargeBorsch - indeed, I made an error of judgement, because those calls are not executed in parallel and thus the memory growth both in the original implementation and in the one using the explicit stack is `O(n)`. Goes to show I'm rusty :-) ... Thank you for the correction - will make edits. – Alexandru Nedelcu Jun 27 '14 at 10:52
  • I'd suggest removing the discussion of `StackOverflowError` -- you won't live long enough to see one. Alternatively, you could move it further down in the post after motivating it with a linear time (non-tail-)recursive algorithm. – Aaron Novstrup Jun 27 '14 at 17:01
  • @AaronNovstrup in my tests `fibonacci(10000)` err`d immediately on my machine, the default stack size is fairly limited - but you're right - I removed that paragraph completely. – Alexandru Nedelcu Jun 27 '14 at 18:57
1

This one will calculate many sub-problems more than once. You can think of the algorithm as a tree.

For example if you ask for fibonacci(4) you calculate:

fib(4) = fib(3) + fib(2) = 2 + 1 = 3
  // left subtree:
  fib(3) = fib(2) + fib(1) = 1+1 = 2
    // left
    fib(2) = fib(1) + fib(0) = 0+1 = 1
    // right
    fib(1) = 1
  // right
  fib(2) = fib(1) + fib(0) = 0+1 = 1

as you can see you calculate fib(2) 2-times and this will only get worse on higher numbers.

As for how you can improve this code:

  • either memoize the values
  • use techniques from dynamic-programming (well it's more or less memoization too)
  • or use a better algorithm!

There are many questions on this topic here already - start here: Efficient calculation of Fibonacci series

Or have a look at this question for scala-specific answers: What is the fastest way to write Fibonacci function in Scala?

Random Dev
  • 51,810
  • 9
  • 92
  • 119