3

I created an instance of Functor for the following definition of a DoubeList (doubly linked list) data type recently. (Motivated by this post) My goal was to define an instance of Functor so that the function f in fmap f db is applied to the first field in each node in the doubly linked list - regardless of which value constructor is passed to fmap. Afterward, I was having trouble explaining to myself how my fmap definition would actually evaluate.

For example, passing in some value constructed with LeftEnd will result in fmap on a Middle or RightEnd. Likewise, a fmap on some value with a Middle or RightEnd constructor will result in an fmap on a previous or next node - and so on and so forth.

However, after exploring the result of fmapping on LeftEnd, MiddleEnd or RightEnd using returnFirst and returnNext, it seems that for some function f in fmap f db, f is applied to the first field in each node only once - like I set out to do. But I can't seem to grasp how this is possible.

My intuition is telling me that the lazy nature of Haskell is coming into play here. Any insight or explanation would be greatly appreciated!

module DoubleList where

    data DoubleList a
        = LeftEnd  a (DoubleList a)
        | Middle   a (DoubleList a) (DoubleList a)
        | RightEnd a (DoubleList a)

    -- perserve identity
    -- fmap id create = id create
    -- perserve function composition
    -- fmap (f . g) create = fmap f . fmap g $ create
    instance Functor DoubleList where
        fmap f (LeftEnd a nxt) = LeftEnd (f a) (fmap f nxt)
        fmap f (Middle a prev nxt) = Middle (f a) (fmap f prev) (fmap f nxt)
        fmap f (RightEnd a prev) = RightEnd (f a) (fmap f prev)
    
    instance Show a =>  Show (DoubleList a) where 
        show (LeftEnd x next) = "Left End " ++ show x ++ "<->" ++ show next
        show (Middle x prev next) = show x ++ "<->" ++ show next
        show (RightEnd x next) = show x  ++ " Right End"
    
    create :: DoubleList Integer
    create = let n1 = LeftEnd  1 n2
                 n2 = Middle   2 n1 n3
                 n3 = Middle   3 n2 n4
                 n4 = Middle   4 n3 n5
                 n5 = RightEnd 5 n4
              in n1

    returnFirst :: DoubleList a -> DoubleList a
    returnFirst (Middle _ prev _) = returnFirst prev
    returnFirst (RightEnd _ prev) = returnFirst prev 
    returnFirst firstNode = firstNode

    returnPrev :: DoubleList a -> DoubleList a
    returnPrev (Middle x fst@(LeftEnd _ _) _ ) = fst
    returnPrev (Middle x mid@(Middle _ _ _) _ ) = mid
    returnPrev (RightEnd x prev) = prev
    returnPrev leftEnd = leftEnd

    returnNext :: DoubleList a -> DoubleList a
    returnNext (LeftEnd x nxt) = nxt
    returnNext (Middle x prev nxt) = nxt
    returnNext (RightEnd _ prev) =  prev
  • 1
    Since two consecutive nodes refer to each other, you can not just recurse on the two, since that would mean that it results in infinite recursion. But yes, lazyness means it will only `fmap f prev`, etc. on demand. – Willem Van Onsem Apr 30 '22 at 13:45
  • Got it. Thanks, @WillemVanOnsem! I was expecting a comment/response from you! haha. – Sevan Golnazarian Apr 30 '22 at 13:56
  • "is applied only once" what do you mean by that? how exactly are you checking this and what are you observing? – Will Ness Apr 30 '22 at 14:57
  • @WillNess, `returnFirst $ returnNext $ returnNext $ fmap (+1) create` results in `Left End 2<->3<->4<->5<->6 Right End`. `f` is applied to each node's value field only once. My recursive definition of `fmap` made me think it would be possible that `(+1)` could be applied to each node more than once. – Sevan Golnazarian Apr 30 '22 at 15:05
  • I feel it's important to point out that this isn't really a typical doubly linked list. It's more like some sort of weird infinite tree (due to Haskell's immutability and implicit recursive nature of `let`). If you want to see your `fmap (+1)` acting "more than once", try doing `returnFirst $ returnNext $ returnPrev $ returnNext $ returnPrev $ returnNext $ ...` or something similar. – DDub Apr 30 '22 at 15:59
  • please ask a specific question based on a specific piece of code, test call, and its output -- actual and expected. put this in the body of the post please. – Will Ness Apr 30 '22 at 17:33
  • @DDub I think the list defined by `create` is the usual doubly-linked list. it does have the proper sharing of nodes. – Will Ness Apr 30 '22 at 17:36
  • You might find [this question and its answer](https://stackoverflow.com/q/28243314/791604) interesting reading. In particular it discusses ways in which you can have two different "shapes" of objects in memory that appear identical to all pure observations. That's what's happening to you: you have an object which appears to all pure observations to be a suitably `fmap`'d doubly linked list, but is actually an infinitely large tree. – Daniel Wagner May 01 '22 at 07:49

1 Answers1

2
instance Functor DoubleList where
    fmap f (LeftEnd a nxt) = LeftEnd (f a) (fmap f nxt)
    fmap f (Middle a prev nxt) = Middle (f a) (fmap f prev) (fmap f nxt)
    fmap f (RightEnd a prev) = RightEnd (f a) (fmap f prev)

Each call to a constructor represents a new node, and each call to fmap represents a separate traversal over the data structure. In your definition of create, you use let bindings to ensure that the nodes are shared, but your definition of fmap doesn’t preserve that sharing. However, you can’t actually observe this from normal safe code.

The reason it works is that the fields of your data structure are lazy, so your fmap can produce a result incrementally: pattern-matching on the result of fmap f create just evaluates up to the constructor (LeftEnd / Middle / RightEnd) and doesn’t evaluate the fields. This means that even though create contains reference cycles, you won’t run into an infinite loop when mapping a function over it and examining the results.

The reason it doesn’t quite work is that it duplicates work. If it could preserve the sharing in the input, the result would be structured like this:

fmap f create
===
let n1' = LeftEnd  (f 1) n2'
    n2' = Middle   (f 2) n1' n3'
    n3' = Middle   (f 3) n2' n4'
    n4' = Middle   (f 4) n3' n5'
    n5' = RightEnd (f 5) n4'
in n1'

But as it is, it’s structured like this:

let n1' = LeftEnd  (f 1) n2'
    n2' = Middle   (f 2) (fmap f n1) (fmap f n3)
in n1'

Notice that n2' doesn’t contain any reference to the new n1', it contains a thunk that will construct a value that happens to be equivalent to n1'. So what you’ve written is a valid Functor instance, but it will perhaps use more memory than you expected, depending on how you traverse the results. This data structure isn’t really a doubly-linked list, it’s a (possibly infinite) tree of computations, where each node may have 1 or 2 children.

By factoring, it’s equivalent to this, which you can think of as a type of infinite streams of values that may split into two at each point:

-- An infinite <3 -ary tree.
data LoveTree a
  = LoveTree a (These (LoveTree a) (LoveTree a))

-- Using ‘Data.These.These’:
data These a b = This a | That b | These a b
Jon Purdy
  • 53,300
  • 8
  • 96
  • 166