When you write a = f b
in a lazy functional language like Haskell or Nix, the meaning is stronger than just assignment. a
and f b
will be the same thing. This is usually called a binding.
I'll focus on a Nix example, because you're asking about recursive sets specifically.
A simple attribute set
Let's look at the initialization of an attribute set first. When the Nix interpreter is asked to evaluate this file
{ a = 1 + 1; b = true; }
it parses it and returns a data structure like this
{ a = <thunk 1>; b = <thunk 2>; }
where a thunk is a reference to the relevant syntax tree node and a reference to the "environment", which behaves like a dictionary from identifiers to their values, although implemented more efficiently.
Perhaps the reason we're evaluating this file is because you requested nix-build
, which will not just ask for the value of a file, but also traverse the attribute set when it sees that it is one. So nix-build
will ask for the value of a
, which will be computed from its thunk. When the computation is complete, the memory that held the thunk is assigned the actual value, type = tInt
, value.integer = 2
.
A recursive attribute set
Nix has a special syntax that combines the functionality of attribute set construction syntax ({ }
) and let
-binding syntax. This is avoids some repetition when you're constructing attribute sets with some shared values.
For example
let b = 1 + 1;
in { b = b; a = b + 5; }
can be expressed as
rec { b = 1 + 1; a = b + 5; }
Evaluation works in a similar manner.
At first the evaluator returns a representation of the attribute set with all thunks, but this time the thunks reference an new environment that includes all the attributes, on top of the existing lexical scope.
Note that all these representations can be constructed while performing a minimal amount of work.
nix-build
traverses attrsets in alphabetic order, so it will evaluate a
first. It's a thunk that references the a +
syntax node and an environment with b
in it. Evaluating this requires evaluating the b
syntax node (an ExprVar
), which references the environment, where we find the 1 + 1
thunk, which is changed to a tInt
of 2
as before.
As you can see, this process of creating thunks but only evaluating them when needed is indeed lazy and allows us to have various language constructs with their own scoping rules.
Haskell implementations usually follow a similar pattern, but may compile the code rather than interpret a syntax tree, and resolve all variable references to constant memory offsets completely. Nix tries to do this to some degree, but it must be able to fall back on strings because of the inadvisable with
keyword that makes the scope dynamic.