Haskell self-referential List termination

Question

EDIT: see this followup question that simplifies the problem I am trying to identify here, and asks for input on a GHC modification proposal.

So I was trying to write a generic breadth-first search function and came up with the following:

import Data.List (find)

bfs :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs predf expandf xs = find predf bfsList
    where bfsList = xs ++ concatMap expandf bfsList

I thought this was pretty elegant; however, when no matching element exists, it blocks forever.

After all the terms have been expanded to [], concatMap will never return another item, so concatMap is blocking waiting for another item from itself? Could Haskell be made smart enough to realize the list generation is blocked reading the self-reference and terminate the list?

The best replacement I've been able to come up with isn't quite as elegant, since I have to handle the termination case myself:

    where bfsList = concat.takeWhile (not.null) $ iterate (concatMap expandf) xs

For concrete examples, the first search terminates with success, and the second one blocks:

bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 3*2**8]
bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 2**8]
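For reference, here is a minimal self-contained sketch of the `takeWhile (not . null)` replacement applied to the two examples. The names `bfsLevels` and `halveOrFifth` are made up for illustration and do not appear in the original code; the expansion function is the same lambda, just named and typed at `Double`.

```haskell
import Data.List (find)

-- Level-by-level breadth-first search: `iterate` yields successive
-- frontiers, and `takeWhile (not . null)` stops at the first empty one,
-- so the concatenated list is finite whenever the search space is.
-- (`bfsLevels` is a hypothetical name for this sketch.)
bfsLevels :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfsLevels predf expandf xs = find predf bfsList
  where
    bfsList = concat . takeWhile (not . null) $ iterate (concatMap expandf) xs

-- The expansion function from the examples (hypothetical name).
halveOrFifth :: Double -> [Double]
halveOrFifth x = if x < 1 then [] else [x / 2, x / 5]
```

In GHCi, `bfsLevels (== 3) halveOrFifth [5, 3 * 2 ** 8]` should give `Just 3.0`, and `bfsLevels (== 3) halveOrFifth [5, 2 ** 8]` should give `Nothing` instead of blocking. An empty starting list also yields `Nothing`, because the first frontier is already empty.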

I now think that adding `takeWhile (not.null)` *is* the right and easiest solution here, and it *is* a perfectly sensible thing to wish for that the `concat . iterate (const [])` should in fact terminate (and be equivalent to `id` for lists). — Will Ness, Oct 20 '17 at 16:06

Will Ness · Answer 1 · 2017-10-20T15:44:11.180

We produce the results list (queue) in steps. On each step we consume what we have produced on the previous step. When the last expansion step added nothing, we stop:

bfs :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs predf expandf xs = find predf queue
    where 
    queue = xs ++ gen (length xs) queue                 -- start the queue with `xs`, and
    gen 0 _ = []                                        -- when nothing in queue, stop;
    gen n q = let next = concatMap expandf (take n q)   -- take n elements from queue,
              in  next ++                               -- process, enqueue the results,
                         gen (length next) (drop n q)   -- advance by `n` and continue

Thus we get

~> bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 3*2**8]
Just 3.0

~> bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 2**8]
Nothing

One potentially serious flaw in this solution is that if any expandf step produces an infinite list of results, it will get stuck calculating its length, totally needlessly so.

In general, just introduce a counter and increment it by the length of solutions produced at each expansion step (length . concatMap expandf or something), decrementing by the amount that was consumed. When it reaches 0, do not attempt to consume anything anymore because there's nothing to consume at that point, and you should instead terminate.

This counter serves in effect as a pointer back into the queue being constructed. A value of n indicates that the place where the next result will be placed is n notches ahead of the place in the list from which the input is taken. 1 thus means that the next result is placed directly after the input value.

The following code can be found in Wikipedia's article about corecursion (search for "corecursive queue"):

data Tree a b = Leaf a  |  Branch b (Tree a b) (Tree a b)

bftrav :: Tree a b -> [Tree a b]
bftrav tree = queue
  where
    queue = tree : gen 1 queue                -- have one value in queue from the start

    gen  0   _                 =         []           
    gen len (Leaf   _     : s) =         gen (len-1) s   -- consumed one, produced none
    gen len (Branch _ l r : s) = l : r : gen (len+1) s   -- consumed one, produced two
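To see the counter at work, here is a small sketch that runs `bftrav` on a concrete tree. The `bftrav` definition is repeated so the snippet stands alone; the helper `label` and the tree `example` are assumptions added for illustration only.

```haskell
data Tree a b = Leaf a | Branch b (Tree a b) (Tree a b)

bftrav :: Tree a b -> [Tree a b]
bftrav tree = queue
  where
    queue = tree : gen 1 queue                -- one value in the queue at the start

    gen  0   _                 =         []              -- nothing left to consume
    gen len (Leaf   _     : s) =         gen (len-1) s   -- consumed one, produced none
    gen len (Branch _ l r : s) = l : r : gen (len+1) s   -- consumed one, produced two

-- Hypothetical helper: expose each node's payload so results can be compared.
label :: Tree a b -> Either a b
label (Leaf a)       = Left a
label (Branch b _ _) = Right b

-- Hypothetical example tree: two branches and three leaves.
example :: Tree Char Int
example = Branch 1 (Branch 2 (Leaf 'a') (Leaf 'b')) (Leaf 'c')
```

Here `map label (bftrav example)` evaluates to `[Right 1, Right 2, Left 'c', Left 'a', Left 'b']`. The counter goes 1, 2, 3, 2, 1, 0 as the five nodes are consumed, and `gen 0 _` ends the list without touching the not-yet-defined tail of the queue.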

This technique is natural in Prolog with top-down list instantiation and logical variables which can be explicitly in a not-yet-set state. See also tail recursion modulo cons.

gen in bfs can be re-written to be more incremental, which is usually a good thing to have:

    gen 0  _     = []
    gen n (y:ys) = let next = expandf y
                   in  next ++ gen (n - 1 + length next) ys
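Putting the incremental `gen` back into the full function gives something like the following sketch. The names `bfsInc` and `halveOrFifth` are made up here so the snippet can sit next to the other versions; the logic is meant to be the same as above.

```haskell
import Data.List (find)

-- Corecursive queue with an element counter: `n` is how far the producer
-- is ahead of the consumer. When it hits 0 there is nothing left to read,
-- so the list is closed off instead of blocking on itself.
-- (`bfsInc` is a hypothetical name for this sketch.)
bfsInc :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfsInc predf expandf xs = find predf queue
  where
    queue = xs ++ gen (length xs) queue
    gen 0  _       = []
    gen n (y : ys) = let next = expandf y                 -- expand one element
                     in  next ++ gen (n - 1 + length next) ys

-- The expansion function from the question's examples (hypothetical name).
halveOrFifth :: Double -> [Double]
halveOrFifth x = if x < 1 then [] else [x / 2, x / 5]
```

With these definitions, `bfsInc (== 3) halveOrFifth [5, 3 * 2 ** 8]` should return `Just 3.0` and `bfsInc (== 3) halveOrFifth [5, 2 ** 8]` should return `Nothing`. Note that `length next` is still computed for every element, so an infinite single expansion would still hang this version.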

Couldn't the language be made to do the "counter increment" steps for us in some cases? At the very least, GHC should be able to tell if a re-entrant read beyond the current end of the list will block forever, and throw an error. I edited my question to a much simpler example to illustrate the class of blocking list generators I am referring to and would be very interested in your thoughts. — Erik, Sep 28 '17 at 18:12
Hopefully when you post the new question it will attract some experts' attention. :) I think GHCi sometimes *is* able to detect these kinds of situations, but I'm not sure about the particulars. — Will Ness, Sep 28 '17 at 20:40

K. A. Buhr · Answer 2 · 2017-09-29T13:24:39.373

Edited to add a note to explain my bfs' solution below.

The way your question is phrased ("could Haskell be made smart enough"), it sounds like you think the correct value for a computation like:

bfs (\x -> False) (\x -> []) []

given your original definition of bfs should be Nothing, and Haskell is just failing to find the correct answer.

However, the correct value for the above computation is bottom. Substituting the definition of bfs (and simplifying the [] ++ expression), the above computation is equal to:

find (\x -> False) bfsList
   where bfsList = concatMap (\x -> []) bfsList

Evaluating find requires determining if bfsList is empty or not, so it must be forced to weak head normal form. This forcing requires evaluating the concatMap expression, which also must determine if bfsList is empty or not, forcing it to WHNF. This forcing loop implies bfsList is bottom, and therefore so is find.

Haskell could be smarter in detecting the loop and giving an error, but it would be incorrect to return [].

Ultimately, this is the same thing that happens with:

foo = case foo of [] -> []

which also loops infinitely. Haskell's semantics imply that this case construct must force foo, and forcing foo requires forcing foo, so the result is bottom. It's true that if we considered this definition an equation, then substituting foo = [] would "satisfy" it, but that's not how Haskell semantics work, for the same reason that:

bar = bar

does not have value 1 or "awesome", even though these values satisfy it as an "equation".

So, the answer to your question is, no, this behavior couldn't be changed so as to return an empty list without fundamentally changing Haskell semantics.

Also, as an alternative that looks pretty slick -- even with its explicit termination condition -- maybe consider:

bfs' :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs' predf expandf = look
  where look [] = Nothing
        look xs = find predf xs <|> look (concatMap expandf xs)

This uses the Alternative instance for Maybe, which is really very straightforward:

Just x  <|> ...     -- yields `Just x`
Nothing <|> Just y  -- yields `Just y`
Nothing <|> Nothing -- yields `Nothing` (doesn't happen above)

so look checks the current set of values xs with find, and if it fails and returns Nothing, it recursively looks in their expansions.
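As a sketch of how this runs end to end, here is `bfs'` with the imports it needs and a small integer example. The expansion function `grow` and its cutoff of 20 are assumptions chosen only to keep the search space finite.

```haskell
import Control.Applicative ((<|>))
import Data.List (find)

bfs' :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs' predf expandf = look
  where
    look [] = Nothing                                   -- empty frontier: give up
    look xs = find predf xs <|> look (concatMap expandf xs)

-- Hypothetical expansion: double and triple a number, stopping above 20.
grow :: Int -> [Int]
grow n = if n > 20 then [] else [2 * n, 3 * n]
```

Here `bfs' (== 12) grow [1]` gives `Just 12` (via 1, 2, 4, 12), while `bfs' (== 7) grow [1]` gives `Nothing`: every value at least doubles per level, so after a few levels all elements exceed 20, the frontier becomes empty, and `look []` ends the search.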

As a silly example that makes the termination condition look less explicit, here's its double-monad (Maybe in implicit Reader) version using listToMaybe as the terminator! (Not recommended in real code.)

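import Control.Applicative ((<|>))
import Control.Monad (liftM2)
import Data.List (find)
import Data.Maybe (listToMaybe)

bfs'' :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a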
bfs'' predf expandf = look
  where look = listToMaybe *>* find predf *|* (look . concatMap expandf)

        (*>*) = liftM2 (>>)
        (*|*) = liftM2 (<|>)
        infixl 1 *>*
        infixl 3 *|*

How does this work? Well, it's a joke. As a hint, the definition of look is the same as:

  where look xs = listToMaybe xs >> 
                  (find predf xs <|> look (concatMap expandf xs))

Thank you for your insights. I am intrigued by your alternatives, but have not quite reached far enough in my Haskell to grasp them, will have to revisit later. I would be very interested in what you think of my edited question, perhaps explaining why `list = 1 : tail list` needs to block forever would illuminate some aspect of Haskell I do not yet understand. — Erik, Sep 28 '17 at 18:06
`bfsList = concatMap (\x -> []) bfsList` is actually an excellent example for this, because with a *little* bit of human reasoning we can see that `concatMap (\x -> []) bfsList` will produce an empty list no matter what, and so `bfsList` *must* be `[]`. Of course this is not how *Haskell* works though. — Will Ness, Sep 28 '17 at 20:45
@Erik what you want is full equational reasoning *all the way down*. But that's not Haskell, unfortunately (or not, YMMV). — Will Ness, Sep 28 '17 at 20:50
I think `foo = case foo of [] -> []` is not the same, because it is non-exhaustive. `foo = case foo of [] -> []; (_:_) -> []` would be the same. — Will Ness, Sep 28 '17 at 21:00
You miss one `xs` on the last line. Great use of `find p (a++b) = find p a <|> find p b` BTW, obviating the `++` and the `length`. — Will Ness, Sep 29 '17 at 07:17
@Erik note that `foo = 1 : undefined` is just as valid a solution to the equation `foo = 1 : tail foo` as `[1,1..]` or `[]`: `1 : undefined = 1 : tail (1 : undefined)`. Haskell chooses the former, because it's the *least defined* (as it contains `undefined`), which is why we say "*least* fixed point". If you use the time-travel metaphor, trying to force something in its own definition creates a paradox, punching a `_|_` shaped hole in your universe. For your function, it's equationally valid to have `undefined` sitting as the tail of your list, and it's chosen because it is less defined than `[]`. — HTNW, Oct 21 '17 at 00:19

Mark Seemann · Answer 3 · 2017-09-27T15:33:20.580

~~bfsList is defined recursively, which is not in itself a problem in Haskell. It does, however, produce an infinite list, which, again, isn't in itself a problem, because Haskell is lazily evaluated.~~

As long as find eventually finds what it's looking for, it's not an issue that there's still an infinity of elements, because at that point evaluation stops (or, rather, moves on to do other things).

AFAICT, the problem in the second case is that the predicate is never matched, so bfsList just keeps producing new elements, and find keeps on looking.

After all the terms have been expanded to [] concatMap will never return another item

Are you sure that's the correct diagnosis? As far as I can tell, with the lambda expressions supplied above, each input element always expands to two new elements - never to []. The list is, however, infinite, so if the predicate goes unmatched, the function will evaluate forever.

Could Haskell be made smart enough to realize the list generation is blocked reading the self-reference and terminate the list?

It'd be nice if there was a general-purpose algorithm to determine whether or not a computation would eventually complete. Alas, as both Turing and Church (independently of each other) proved in 1936, such an algorithm can't exist. This is also known as the Halting problem. I'm not a mathematician, though, so I may be wrong, but I think it applies here as well...

The best replacement I've been able to come up with isn't quite as elegant

Not sure about that one... If I try to use it instead of the other definition of bfsList, it doesn't compile... Still, I don't think the problem is the empty list.

here we know, because it is *we* that compute. on each step we consume what *we* have produced on the previous step. when the last expansion step added nothing, we stop. I've added an answer with the working code. — Will Ness, Sep 27 '17 at 12:41
and yes the OP diagnosis was correct. have you missed the `if x<1 then []` part? these are not integers, because `(**) :: Floating a => a -> a -> a`, thus `2**8 :: Floating a => a`, and so `2**8 /5 /5 /5 /5 = 0.4096`. — Will Ness, Sep 27 '17 at 15:28
@WillNess Yes, I totally missed that. I don't know how I managed to do that, because I wasn't even in a particular hurry; I really have no excuse. — Mark Seemann, Sep 27 '17 at 15:31
I blame the (lack of) whitespace in `x<1`. I once too wrote dense code, like OP. I now even add extraneous spaces often. — Will Ness, Sep 27 '17 at 15:33
@WillNess Thank you for keeping me honest. I considered deleting the answer, but then I thought that maybe the part about the Halting problem is marginally useful, so I ended up striking most of the other content. — Mark Seemann, Sep 27 '17 at 15:34
