What does this binary-search function do exactly?

Question

I struggle to understand how this binary-search function works:

bsearch :: Ord a => [a] -> a -> Bool
bsearch [] _ = False
bsearch xs x =
   if x < y then bsearch ys1 x
   else if x > y then bsearch ys2 x
   else True
  where
    ys1 = take l xs
    (y:ys2) = drop l xs
    l = length xs `div` 2

I tried to work through it with an example, bsearch [1,2,3,4] 4, but I don't understand where the function starts. I believe l = length xs `div` 2 is calculated first, which gives l = 2. Next I substitute into (y:ys2) = drop l xs: drop 2 [1,2,3,4] = [3,4], so (y:ys2) = 3:[4]. Then the branch else if 4 > 3 then bsearch ys2 x is taken, with ys2 = [4] and x = 4. What happens next? How do x = 4 and ys2 = [4] get compared?

EDIT: I think that, since bsearch [4] 4 is the new bsearch xs x, the new l = length xs `div` 2 = length [4] `div` 2 = 0, which gives drop 0 [4] = [4] = (4:[]). Both 4 < 4 and 4 > 4 are False, so the result is else True. Is this how the function executes for my example?

I would be very happy if someone could help me with this function.

This is not really a good binary search: binary search typically runs in logarithmic time, and for a list that is impossible, since obtaining the *i*-th element takes *O(i)* time. — Willem Van Onsem, May 27 '18 at 19:29
The list `[1,2,3,4]` is split into three parts: `[1,2]`, `3` and `[4]`. If the number is 3, it returns true. Otherwise, it searches the relevant sublist. — that other guy, May 27 '18 at 19:32
I think your description is correct: the function is roughly evaluated that way. Also note that the evaluation order is not that important, since whatever order you take, as long as you eventually terminate, your result will be correct (since there are no side effects). — chi, May 27 '18 at 20:01
Possible duplicate of [Understanding order of evaluation in Haskell](https://stackoverflow.com/questions/48384074/understanding-order-of-evaluation-in-haskell) — jberryman, May 27 '18 at 23:39
@WillemVanOnsem still it looks like the best you can do for a list. E.g. `bsearchArray . toArray` would be no better. — luqui, May 28 '18 at 06:07
@luqui Linear search would be better: just consider elements one at a time, in order. Attempting to do a binary search is just slower. — amalloy, May 28 '18 at 07:46
@luqui: I think a linear search will probably be sufficient here (and a cutoff once we run into larger values). But this `take`, `drop` (and especially `length` behavior) is probably making it worse. — Willem Van Onsem, May 28 '18 at 08:28
@WillemVanOnsem ah yes, the `length` thing, missed that. Seems irrecoverable. — luqui, May 28 '18 at 08:35
The slow part of linear search is the linear iteration, which `bsearch` on a linked list (via `take` and `drop`) has to do anyway. A proper binary search simply requires constant-time access to the target elements of the container being searched. (Note this doesn't require random access; a binary search tree works because you only ever need access to the children of the current node, not access to any arbitrary node in the tree.) — chepner, May 28 '18 at 16:17

Yann Vernier · Accepted Answer · 2018-05-28T13:06:09.030

Your interpretation of how the bindings expand is correct. The function essentially operates by converting a finite sorted list on demand into a binary search tree. I could rewrite portions of the function just to show that tree structure (note that the where portion is unchanged):

data Tree a = Node (Tree a) a (Tree a) | Empty
  deriving Show

tree [] = Empty
tree xs = Node (tree ys1) y (tree ys2)
  where
    ys1 = take l xs
    (y:ys2) = drop l xs
    l = length xs `div` 2

The tree form can then be produced:

*Main> tree [1..4]
Node (Node (Node Empty 1 Empty) 2 Empty) 3 (Node Empty 4 Empty)

The recursive upper part of the original function is then only about traversing the relevant portion of that tree:

bsearchT Empty _ = False
bsearchT (Node ys1 y ys2) x =
   if      x < y then bsearchT ys1 x
   else if x > y then bsearchT ys2 x
   else               True

bsearch xs x = bsearchT (tree xs) x
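To check the trace from the question against the actual evaluation, here is a minimal sketch of an instrumented variant of the original list-based function. The name bsearchTrace and its helper step are hypothetical, not part of the original code; the function returns the pivots y that were compared against x, in order, alongside the result.

```haskell
-- Hypothetical helper (not in the original): behaves like bsearch, but also
-- returns the list of pivots y that x was compared against, in order.
bsearchTrace :: Ord a => [a] -> a -> ([a], Bool)
bsearchTrace [] _ = ([], False)
bsearchTrace xs x =
  case drop l xs of
    []      -> ([], False)  -- unreachable for non-empty xs; keeps the match total
    (y:ys2)
      | x < y     -> step y (bsearchTrace ys1 x)  -- continue in the left half
      | x > y     -> step y (bsearchTrace ys2 x)  -- continue in the right half
      | otherwise -> ([y], True)                  -- found: y == x
  where
    l   = length xs `div` 2
    ys1 = take l xs
    -- Record the pivot that was just inspected.
    step y (seen, found) = (y : seen, found)
```

In GHCi, bsearchTrace [1,2,3,4] 4 evaluates to ([3,4],True): the pivot 3 is inspected first, then the search continues on [4], where the pivot 4 matches. That is the same sequence as in the question's EDIT.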

The operation itself does suggest that a plain list is not the appropriate data type; we can observe that Data.List.Ordered.member performs a linear search, because lists must be traversed from the head and may be infinite. Arrays or vectors provide random access, so there is indeed a Data.Vector.Algorithms.Search.binarySearch.
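The two alternatives mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the code of either library: memberSorted and bsearchArr are hypothetical names, and the array version uses Data.Array from the array package (which ships with GHC) rather than Data.Vector, only to avoid an extra dependency.

```haskell
import Data.Array (Array, bounds, listArray, (!))

-- Hypothetical name: linear search on a sorted list with an early cutoff.
-- It stops as soon as it meets an element larger than x, so it also
-- terminates on infinite sorted lists whenever such an element exists.
memberSorted :: Ord a => a -> [a] -> Bool
memberSorted _ []     = False
memberSorted x (y:ys) = case compare x y of
  LT -> False              -- every remaining element is larger than x
  EQ -> True
  GT -> memberSorted x ys

-- Hypothetical name: binary search on a sorted array. Each (!) lookup is
-- constant time, so the whole search takes O(log n) comparisons.
bsearchArr :: Ord a => Array Int a -> a -> Bool
bsearchArr arr x = go lo0 hi0
  where
    (lo0, hi0) = bounds arr
    go lo hi
      | lo > hi   = False  -- empty range: x is not present
      | otherwise =
          let mid = lo + (hi - lo) `div` 2
          in case compare x (arr ! mid) of
               LT -> go lo (mid - 1)
               GT -> go (mid + 1) hi
               EQ -> True

-- Building the array from a list costs O(n) once; searches are cheap afterwards.
fromSortedList :: [a] -> Array Int a
fromSortedList xs = listArray (0, length xs - 1) xs
```

The conversion only pays off when the same sorted data is searched many times. For a single lookup in a list, the linear version already does the minimum amount of traversal.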
