
I was using a recursive function in F# to build a particular tree structure, using containers that were evaluated at each stage. I was instructed to use Seq instead because its lazy evaluation should minimize the number of operations. (I understand that, for example, .NET sort functions using lazy evaluation are quicker than using in-place swaps on an array.) This change actually had the opposite effect, slowing the tree-building down a great deal for large inputs. I think I have broadly worked out the problem, as demonstrated in the code below:

First we have a "lessthan" function that counts how many times it has been called:

let mutable count = 0
let lessthan a b =
    count <- count + 1
    b < a

Then versions of a simple divide-and-conquer for Seq and Array:

let rec recursion n inSeq = 
  if  n<=1 then inSeq
  else 
     let pivot = n/2
     let left = inSeq |> Seq.filter (fun i->i|>lessthan pivot) |> recursion (n/2)
     let right = inSeq |> Seq.filter (fun i->not (i|>lessthan pivot)) |> recursion (n/2)
     Seq.append left right


// the same function with arrays to give eager evaluation
let rec recursionA n inAr =
  if  n<=1 then inAr
  else
     let pivot = n/2
     let left = inAr |> Array.filter (fun i->i|>lessthan pivot) |> recursionA (n/2)
     let right = inAr |> Array.filter (fun i->not (i|>lessthan pivot)) |> recursionA (n/2)
     Array.append left right

Finally, a test that reports how many times the comparison is called and, for the lazy version, when those calls happen.

let test n= 
    let reverse = Seq.init  n (fun i->i) |>recursion n
    do printf "lazy:"
    for n in reverse do printf "%d (%d) " n count
    do printf "\n\n"
    do count<-0  
    let reverseArray=Array.init  n (fun i->i)|>recursionA n
    do printf "eager:\n%d\n" count

 //IO
do printf "Type an integer\n"
let intstring=System.Console.ReadLine()
let worked, numberofpoints = System.Int32.TryParse(intstring) 
do if worked then test numberofpoints

When I input a power of 2, m = 2^k, into this function, the array version calls lessthan 2*m*k times, which makes sense (2m at each level of recursion). As m grows, the Seq version calls lessthan about m(log m)^2 times, which is presumably something like evaluating each value as

"in 0..2^n-1" AND less than 2^(n-1) AND in 0..2^(n-1)-1 AND less than 2^(n-2) AND..."

(but it isn't QUITE that, based on when the number of calls jumps up and what the comparisons are when you step through in debug mode)

Is there any way to remove these effects and have the lazy evaluation improve performance as it would for other algorithms?

NB. I checked that incrementing the counter didn't make a difference by counting by hand in debug mode.

I know the array version could make even fewer comparisons by using Array.partition.
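For reference, a minimal sketch of that Array.partition variant, assuming the same counting lessthan as above. The name recursionP is hypothetical and only used for illustration. Each element is compared once per level instead of twice, so for m = 2^k the count should come to m*k.

```fsharp
// Same counting comparison as in the question.
let mutable count = 0
let lessthan a b =
    count <- count + 1
    b < a

// Hypothetical variant of recursionA: Array.partition splits the input in one
// pass, so the predicate runs once per element rather than once per filter.
let rec recursionP n (inAr: int[]) =
    if n <= 1 then inAr
    else
        let pivot = n / 2
        let left, right = inAr |> Array.partition (lessthan pivot)
        Array.append (recursionP (n / 2) left) (recursionP (n / 2) right)

let result = recursionP 8 (Array.init 8 id)
printfn "comparisons: %d" count   // 24 for m = 8 (m*k with k = 3)
```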

MatthewJohnHeath
  • The indentation of your `recursion` functions seems a little odd - are you sure you have copied them correctly? – John Palmer Apr 28 '15 at 11:55
  • Quite right. Thanks. Hopefully fixed now. – MatthewJohnHeath Apr 28 '15 at 12:03
  • `Seq.filter` produces a new sequence, which, when evaluated, always evaluates the whole original sequence, and applies the filter as it goes. In your algorithm, you basically produce a lot of sequences with a lot of filters stacked on each. Therefore, it's absolutely no surprise that the complexity becomes exponential. What I don't understand is, who told you that lazy evaluation would increase performance, and what was their rationale? – Fyodor Soikin Apr 28 '15 at 12:12

1 Answer


The problem with the lazy solution using seq<'T> is that sequences do not do any caching. This means that when you create a sequence and use it twice, it is re-evaluated:

let test = Seq.init 10 (fun i -> printfn "%d" i; i)
test |> Seq.map id |> Seq.length  // Prints 0 .. 9
test |> Seq.map id |> Seq.length  // Prints 0 .. 9 again

One way to avoid this is to use Seq.cache, which returns a sequence that evaluates only once and caches its results (so, it gets faster, but needs more memory).
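A small sketch of the difference, counting how often the generator function runs. The names evaluations, source, uncached and withCache are made up for this illustration. Seq.toList is used to force the elements, since an operation that only counts items may never read them.

```fsharp
// Illustrative counter: how many times the generator function has run.
let mutable evaluations = 0

let source =
    Seq.init 10 (fun i ->
        evaluations <- evaluations + 1
        i)

// Without caching, every full enumeration re-runs the generator.
source |> Seq.toList |> ignore
source |> Seq.toList |> ignore
let uncached = evaluations        // 20: ten elements, evaluated twice

// With Seq.cache, the first enumeration fills the cache and the second reads from it.
evaluations <- 0
let cachedSource = Seq.cache source
cachedSource |> Seq.toList |> ignore
cachedSource |> Seq.toList |> ignore
let withCache = evaluations       // 10: ten elements, evaluated once

printfn "uncached: %d, cached: %d" uncached withCache
```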

In your example, you could run Seq.cache in recursion (before you pass the inSeq to two different functions that process it):

let rec recursion n inSeq =   
  if  n<=1 then inSeq
  else 
    let pivot = n/2
    let inSeq = Seq.cache inSeq
    let left = inSeq |> Seq.filter (lessthan pivot) |> recursion (n/2)
    let right = inSeq |> Seq.filter (lessthan pivot >> not) |> recursion (n/2)
    Seq.append left right

(I also used more sane indentation - indenting the whole body of the function so that it is after the = sign apparently works, but it is not really a recommended pattern. Adding a newline makes code a lot more readable...)
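To check the effect on the question's counter, something like the following sketch could be used. It repeats the counting lessthan from the question, and the names recursionCached and countComparisons are hypothetical. Assuming the result is enumerated only once, each filter runs once over its cached input, so the count should match the array version's 2*m*k.

```fsharp
// Counting comparison, as in the question.
let mutable count = 0
let lessthan a b =
    count <- count + 1
    b < a

// The cached recursion from this answer, under an illustrative name.
let rec recursionCached n (inSeq: seq<int>) =
    if n <= 1 then inSeq
    else
        let pivot = n / 2
        let cached = Seq.cache inSeq
        let left = cached |> Seq.filter (lessthan pivot) |> recursionCached (n / 2)
        let right = cached |> Seq.filter (lessthan pivot >> not) |> recursionCached (n / 2)
        Seq.append left right

// Hypothetical helper: enumerate the whole result once and report the number of comparisons.
let countComparisons m =
    count <- 0
    Seq.init m id |> recursionCached m |> Seq.iter ignore
    count

printfn "m = 8: %d comparisons" (countComparisons 8)
```

The trade-off is the one already noted for Seq.cache: every level keeps its cached elements in memory for as long as the sequence is reachable.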

Tomas Petricek