Emacs Lisp shared structure and shared links

Question

Consider the cons x1:

(setq x1 '(a . (b c))) => (a b c)

or in list notation:

(setq x1 '(a b c)) => (a b c)

and the cons x2, built on x1:

(setq x2 (cons 'A (cdr x1))) => (A b c)

The help for cons (in Emacs) says that the function creates a new cons, gives it the arguments, 'A and (cdr x1), as components, and returns it. Nothing in it suggests that the life of the newly returned cons will be linked to that of the components it was built from.

Yet if one modifies the copy, x2, the originating cons, (a . (b c)), gets modified too:

(setcar (cdr x2) 'B) => B
x2 => (A B c)  ; as from the assignment 
x1 => (a B c)  ; x1 and x2 are linked

Other functions show the same link between x1 and x2:

(setq x1 '(a . (b c))) => (a b c)
(setq x2 (cons 'A (cdr x1))) => (A b c)
(nreverse x2) => (c b A)
x1 => (a b A)

I took this example from the documentation of setcar in the Emacs Lisp Reference Manual, which states that the "cons cell is part of the shared structure". The cdr of x1 and x2 is referred to as a "shared link", and x1 and x2 are shown graphically like this (slightly adapted):

x1:
   --------------       --------------       --------------
  | car   | cdr  |     | car   | cdr  |     | car   | cdr  |
  |   a   |   o------->|   b   |   o------->|   c   |  nil |
  |       |      |  -->|       |      |     |       |      |
   --------------  |    --------------       --------------
                   |
  x2:              |
   --------------  |
  | car   | cdr  | |
  |   A   |   o----
  |       |      |
   --------------

This is something reminiscent of C pointers, in that the cdr of x2 is not a copy, but "points" to the cdr of x1. Clear, but I wonder when this situation arises in practice, that is, how can I know whether (an element of) a cons points to another one or is a self living copy? More generally what (where) is a formal definition of shared structure and shared links?

In the Emacs Lisp Reference Manual there is no explicit mention of them. In fact a search for "shared" or "link" in its index returns (excluding file/web links) only the indirect reference to "shared structure, read syntax", which deals with their representation, not with what they are.
Curiously, searching the PDF for "shared" lands first on the section "Read Syntax for Circular Objects", which starts with "To represent shared or circular structures...". Unfortunately there is no prior mention of the words shared and circular (structures)! The next occurrence is the setcar documentation mentioned above.

So it seems that there are implicit pointers in Lisp/Elisp, but no one is willing to talk about them.

Introductory Lisp books should explain how lists are made of cons cells. A cons cell is basically a data structure with two pointers (minus some optimizations). That's nothing new - it was that way in the very first Lisp, over fifty years ago. Making lists and other data structure out of cons cells and the functions which use them, is at the core of Lisp. Looks like you are at a good start. — Rainer Joswig, Feb 09 '14 at 15:32
@RainerJoswig: Comment split: **Part1**. Please, have a look here: [psg.com/~dlamkins](http://psg.com/~dlamkins/sl/chapter11.html). "a destructive function such as NREVERSE *sometimes* modifies its argument in such a way that the changed argument is identical to the function result." Sometimes? When? — antonio, Feb 09 '14 at 17:00
@RainerJoswig: Comment split: **Part2** "you should not depend upon DELETE's side-effects. [...] But some macros, for example PUSH and POP, take a place as an argument and *arrange to update* the place with the correct value." Many thanks to the author, but the documentation should state this with the same clarity. There is no hint in the Emacs docs that `delete` is unreliable while `push`/`pop` succeed in modifying the objects involved. Besides, what is the point of having destructive functions with side effects if you cannot depend upon them? Are they faster or less memory-consuming? — antonio, Feb 09 '14 at 17:01
The Common Lisp standard allows implementors some freedom in how they implement the various functions and macros. You can reverse a list in place -> then it is identical. You can reverse a list with new cons cells. The main effect is the same: a reversed list. The side effect depends on what the implementation actually does to compute the reversed list. It is written with clarity where implementations have a degree of freedom. DELETE is not unreliable. The standard describes exactly what it does and that you should not take advantage of side effects. — Rainer Joswig, Feb 09 '14 at 17:13
To say that "`x1` and `x2` are linked" is a conceptual mistake. Instead, it is better to think of `x1` and `x2` as pointing to objects that point to the same object. You can see that there is no link between `x1` and `x2` by doing something like `(setq x2 'zonk)` and observing that there is no change in the other value. It is important to understand the difference between changing `x2`, and changing something that `x2` is pointing at: your code is doing the latter. — gsg, Feb 09 '14 at 17:42
Is there a question? But to your final assertion, that behavior of lists is explained in introductory Lisp texts that I have read when they describe list processing in any kind of detail. — lurker, Feb 09 '14 at 18:53
@mbratch: the question is how can I know whether (an element of) a cons points to another one or is a self living copy? More generally what (where) is a formal definition of shared structure and shared links? And it was partly answered by sds. — antonio, Feb 09 '14 at 21:15
When you use `(cdr my-list)` it refers directly to the "tail" of `my-list`. It doesn't create a copy of the `cdr`. It *is* the `cdr`. So, `(setf a (cons 'x (cdr my-list)))` says to create a `cons` cell containing `x` and pointing to `(cdr my-list)`. — lurker, Feb 09 '14 at 21:35
@gsg: Actually I read this term in the `setcar` documentation; to me it is easier to reason in terms of pointer logic. I suppose your example is a consequence of the fact that in C a pointer will always point to the same type of object, while a Lisp "pseudo-pointer" variable (symbol) can point to whatever type of object and even become a non-pointer variable. — antonio, Feb 09 '14 at 21:36
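
The distinction gsg draws, between changing `x2` and changing something that `x2` points at, can be sketched roughly as below. This is only an illustration: `x2-old` is a hypothetical extra variable introduced here to keep a handle on the cons after `x2` is rebound, and the lists are built with `list` rather than quoted constants so that mutating them is safe.

```elisp
;; Fresh cons cells: x2's cdr IS x1's cdr, not a copy of it.
(setq x1 (list 'a 'b 'c))
(setq x2 (cons 'A (cdr x1)))

;; Keep a second reference to the cons that x2 currently names
;; (x2-old is a hypothetical name, used only for this sketch).
(setq x2-old x2)

;; Rebinding the variable x2 does not touch any cons cell.
(setq x2 'zonk)
x1        ; => (a b c)

;; Mutating a cell reachable from both lists is visible through both.
(setcar (cdr x2-old) 'B)
x1        ; => (a B c)
x2-old    ; => (A B c)
```

So the variables themselves are not linked; what is shared is the cons cell that both lists reach through their cdr.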

sds · Answer 1 · 2014-02-10T06:32:31.610

Your friends are the functions eq and equal.

eq compares for physical identity while equal checks whether the objects "look alike".

In your case:

(defvar a (list 1 2 3))
(defvar b (cons 1 (cdr a)))
(equal a b)
==> t
(eq a b)
==> nil
(eq (cdr a) (cdr b))
==> t

EDIT: note that list is equivalent to a few cons calls:

(list x y) == (cons x (cons y nil))

and whenever you call cons or list, you get something which is not eq to anything else.

Continuing the example above:

(defvar c (list 4 (cdr a)))
(defvar d (list 4 (cdr b)))
(equal c d)
==> t
(eq c d)
==> nil
(eq (cdr c) (cdr d))
==> nil
(eq (cadr c) (cadr d))
==> t
(eq (cadr c) (cdr a))
==> t
(eq (cadr d) (cdr b))
==> t

PS. It is useful to realize that (E x y) ==> (E (F x) (F y)) where E is an equality predicate (eq or equal) and F is an accessor (e.g., car or cdr).

PPS. The converse is true for equal:

(and (equal (car x) (car y)) 
     (equal (cdr x) (cdr y)))

implies (in fact, is equivalent to) (equal x y); but not for eq.
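
The PS and PPS can be checked with a small sketch along these lines. It reuses the names `a` and `b` from the example above but sets them with `setq`, which is an assumption made here so the snippet behaves the same even if the variables are already bound.

```elisp
(setq a (list 1 2 3))
(setq b (cons 1 (cdr a)))

;; PS: if the wholes are equal, so are the corresponding parts.
(equal a b)                ; => t
(equal (car a) (car b))    ; => t
(equal (cdr a) (cdr b))    ; => t

;; PPS: for eq the converse fails. Both parts are eq,
;; yet the two top-level cons cells are distinct objects.
(eq (car a) (car b))       ; => t
(eq (cdr a) (cdr b))       ; => t
(eq a b)                   ; => nil
```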

Thanks this gives a very general way of testing the links. I still miss the rationale. For example, given again `(setq x1 '(a b c))`, with the `cons` function I obtain `(setq a1 (cons 'A (cdr x1)))`, `(setq a2 (cons 'A (cdr x1)))`, `(eq (cdr a1) (cdr a2)) => t`, but using the `list` function, I get `(setq a1 (list 'A (cdr x1)))`, `(setq a2 (list 'A (cdr x1)))`, `(eq (cdr a1) (cdr a2)) => nil`. In both cases `a1`/`a2` are `type-of` cons, they share the *same* cdr, resp. `(b c)` and `((b c))`; but in the first case they are `eq` in the second they are not `eq`. — antonio, Feb 09 '14 at 21:08
@antonio: you are missing the basic difference between `list` and `cons`. see edit — sds, Feb 09 '14 at 21:31

score 0 · Answer 2 · answered Feb 09 '14 at 19:28

To add to @sds's answer, since he did not mention it explicitly, and you asked about this:

See the Elisp manual, node Modifying Lists and its subnodes. The example you ask about is mentioned explicitly in node Setcar:

 ;; Create two lists that are partly shared.
 (setq x1 '(a b c))
      => (a b c)
 (setq x2 (cons 'z (cdr x1)))
      => (z b c)

Yes, your question is not specifically about setcar and other list structure-modifying functions. But the presentation in the Elisp manual about cons cells provides the answer you are looking for in general, in addition to the comments and answers given here about how arguments are passed in Lisp.

Drew

I wrote that my example is taken from `setcar` documentation and, as it is the only reference I found to such a relevant subject, I was asking where to find more insights and also if there is some conceptual way to infer when an object is going to be linked to another. – antonio Feb 09 '14 at 21:09
antonio: FWIW, "pointer" is indexed in the manual, and takes you to `C-h i g` `(elisp) Cons Cell Type` `RET`. I realise that's somewhat tangential to your actual question, and you've almost certainly covered it by now; but make sure you read through that section if you've not already done so. – phils Feb 09 '14 at 23:51

Joshua Taylor · Answer 3 · 2014-02-10T15:21:28.880

There are some other answers here that explain this in more detail, but it might be helpful to aim for a very minimal analogy, too. A cons cell is a very small container: it holds two elements, and has accessors car and cdr for getting those elements back out.

In most object oriented programming languages, there's no automatic copying of objects when you put them into a container. E.g., in Java, if you have:

Object a = new Object();
Object b = new Object();

Object[] cons1 = new Object[] { a, b };
Object[] cons2 = new Object[] { a, b };

You should expect

cons1 == cons2

to be false, but

( cons1[0] == cons2[0] ) && ( cons1[1] == cons2[1] )

to be true. The container objects are different, but the objects that they contain are the same.

It's just a convention that lists are built from cons cells, where a list is either the empty list (nil), or a cons cell whose car is the first element of the list and whose cdr is the rest of the list.

This is something reminiscent of C pointers, in that the cdr of x2 is not a copy, but "points" to the cdr of x1. Clear, but I wonder when this situation arises in practice, that is, how can I know whether (an element of) a cons points to another one or is a self living copy? More generally what (where) is a formal definition of shared structure and shared links?

It's not just reminiscent, it's pretty much the same thing. There are objects in memory, and you can get hold of them. Sometimes more than one thing may have a reference to the same object in memory. You can test cons cells for identity with eq, but that's not a general answer about shared structure, because you have no way to know who else has a reference to an object. In general, you'll follow a rule along the lines of "don't modify structures you didn't create, unless you explicitly mention it in the documentation, and even then, return the important value." Thus reverse doesn't modify its argument, but nreverse is allowed to (but still returns the reversed list; nreverse doesn't guarantee that the list is reversed in-place). If you're building up a list that's local to your function, it's fine to use nreverse to reverse it, because you know that no one else has a reference to it.
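
As a rough sketch of that rule, the snippet below builds a list locally with `push` and reverses it with `nreverse`, then copies a possibly shared list with `copy-sequence` before reversing it destructively. The names `my-squares`, `my-input`, `my-result` and `my-copy` are hypothetical, chosen only for this illustration.

```elisp
(defun my-squares (numbers)
  "Return a fresh list of the squares of NUMBERS, in the same order.
Hypothetical helper, for illustration only."
  (let ((acc nil))
    (dolist (n numbers)
      (push (* n n) acc))      ; acc is built back to front
    ;; acc was created here and no one else holds a reference to it,
    ;; so a destructive reversal is safe.  Use the return value.
    (nreverse acc)))

(setq my-input (list 1 2 3))
(setq my-result (my-squares my-input))
my-result    ; => (1 4 9)
my-input     ; => (1 2 3), unchanged

;; When a list might be shared, copy its cons cells before modifying.
(setq my-copy (nreverse (copy-sequence my-input)))
my-copy      ; => (3 2 1)
my-input     ; => (1 2 3), still intact
```

Note that `copy-sequence` copies only the top-level cons cells of a list; the elements themselves are still shared with the original.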
