How many "instances" of Node are created?
Generally, when you write a data constructor such as NodeRef or Node, this represents a computation that will allocate such a value (cf. new Node(…) in a typical imperative/OOP language). When you write a variable binding with let or where, mentioning that name again will refer to the same value.
A top-level binding, or a where binding that doesn’t depend on any of the parameters, is similarly shared, and is also known as a “constant applicative form” (CAF). So ref0, ref1, ref2, and refs are all CAFs, and any mention of e.g. ref0 will refer to the same NodeRef value.
On each invocation, deref allocates a list allNodes which contains as many Node values as there are NodeRef values in the input list refs: three in this case. Each expression allNodes !! targetIndex refers to an element of that same shared allNodes list.
Finally, since deref returns head allNodes, only those nodes that are actually reachable from the head node will remain live, and the others will be garbage-collected.
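To make the allocation story concrete, here is a minimal sketch of what such a deref might look like. The question's code isn't reproduced here, so the field names (refName, refTargets, nodeName, nodeTargets) and the sample data are assumptions for illustration only.

```haskell
-- Hypothetical reconstruction; field names and sample data are assumptions.
data NodeRef = NodeRef { refName :: String, refTargets :: [Int] }

data Node = Node { nodeName :: String, nodeTargets :: [Node] }

-- CAFs: each is evaluated at most once and shared by every use.
ref0, ref1, ref2 :: NodeRef
ref0 = NodeRef "a" [1]
ref1 = NodeRef "b" [2]
ref2 = NodeRef "c" [0]

refs :: [NodeRef]
refs = [ref0, ref1, ref2]

-- Ties the knot: one Node per NodeRef, and every target is looked up
-- in the same allNodes list rather than being rebuilt.
deref :: [NodeRef] -> Node
deref rs = head allNodes
  where
    allNodes = map toNode rs
    toNode (NodeRef name targets) = Node name (map (allNodes !!) targets)
```

Following the targets three times from deref refs leads back to the node named "a"; that is the same heap object as the head node, not a fresh copy.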
When I am running "showme" am I creating more nodes with each step?
showme allocates a number of list “cons” cells (:) equal to the count argument. Or, in case count is negative, it will continue decrementing until it wraps around the range of Int and comes back to 0. To avoid that, you can use a guard with <= instead, or use pred instead of - 1 to get a checked decrement, which raises an error at minBound rather than wrapping silently. It does not copy the nodes themselves; it only manipulates references to them (and their fields).
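The guard-based fix could be sketched as follows. The signature of showme and the shape of Node here are assumptions, since the original definitions aren't shown; the point is only that the base case uses <= rather than an equality test.

```haskell
-- Hypothetical node type with a single outgoing link (an assumption).
data Node = Node { nodeName :: String, nodeNext :: Node }

-- A two-node cycle, built by mutual reference.
nodeA, nodeB :: Node
nodeA = Node "a" nodeB
nodeB = Node "b" nodeA

-- Collects the names of the first `count` nodes along the chain.
-- The `<=` guard makes zero and negative counts terminate immediately.
showme :: Int -> Node -> [String]
showme count node
  | count <= 0 = []
  | otherwise  = nodeName node : showme (count - 1) (nodeNext node)
```

Each recursive step adds one cons cell to the result and follows an existing pointer; nodeA and nodeB themselves are never copied.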
What is the right way of building a data structure with circular references?
Your approach is correct here. If you want to use these circular data structures, you will need to be careful not to use naïve unbounded recursion when traversing them, so as to avoid nontermination. You also won’t be able to easily change this data structure (that is, construct an amended version of it) without losing the sharing. Therefore it’s common to just use the ID-based representation you have in NodeRef, which is an example of “observable sharing”, stored in a parent structure such as an IntMap. You might move to Node if you want to “freeze” the representation.
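As a rough illustration of the ID-based approach, the sketch below stores NodeRef values in an IntMap from the containers package. The Graph alias, the field names, and the rename and walk helpers are all hypothetical names introduced for this example.

```haskell
import qualified Data.IntMap.Strict as IntMap
import Data.IntMap.Strict (IntMap)

-- Field names are assumptions; targets are keys into the graph.
data NodeRef = NodeRef { refName :: String, refTargets :: [Int] }

type Graph = IntMap NodeRef

graph :: Graph
graph = IntMap.fromList
  [ (0, NodeRef "a" [1])
  , (1, NodeRef "b" [2])
  , (2, NodeRef "c" [0])
  ]

-- Amending one node is cheap: the other nodes refer to it by ID,
-- so they see the new version without being rebuilt.
rename :: Int -> String -> Graph -> Graph
rename key newName = IntMap.adjust (\r -> r { refName = newName }) key

-- Follows the first target of each node for at most `steps` nodes,
-- so the traversal terminates even though the graph is cyclic.
walk :: Int -> Int -> Graph -> [String]
walk steps key g
  | steps <= 0 = []
  | otherwise  = case IntMap.lookup key g of
      Nothing -> []
      Just (NodeRef name targets) ->
        name : case targets of
          []         -> []
          (next : _) -> walk (steps - 1) next g
```

For example, walk 4 0 (rename 1 "B" graph) visits "a", "B", "c", and then "a" again. The trade-off is that every edge traversal is a map lookup that can fail, which is the cost of keeping the structure easy to update.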
Knot-tying is more commonly used as a convenience, to avoid the need for the explicit mutation and manual sequencing used in imperative languages. As long as the dataflow isn’t circular, then you can just write the definition of a whole data structure and let the ordinary process of lazy evaluation fill it in.
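A small sketch of knot-tying used purely as a convenience: a lazy Map whose values are defined in terms of the map itself. The deps table and its contents are made up for illustration; the technique works here because the dependencies form no cycle.

```haskell
import qualified Data.Map.Lazy as Map

-- Hypothetical table: each item has its own cost and a list of dependencies.
deps :: Map.Map String (Int, [String])
deps = Map.fromList
  [ ("app",  (1, ["lib", "util"]))
  , ("lib",  (2, ["util"]))
  , ("util", (3, []))
  ]

-- Each total refers back to `totals` itself. Laziness evaluates the
-- entries in whatever order the dependencies require, and each entry
-- is computed at most once.
totals :: Map.Map String Int
totals = Map.map total deps
  where
    total (own, ds) = own + sum [totals Map.! d | d <- ds]
```

No explicit ordering or mutation is needed. If the dependencies were circular, forcing an entry would loop instead of producing a value.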