SICP's description of pointers

Question

This quote from SICP is, I think, talking about pointers/references in programming languages.

As we have seen, pairs provide a primitive “glue” that we can use to construct compound data objects. Figure 2.2 shows a standard way to visualize a pair—in this case, the pair formed by (cons 1 2). In this representation, which is called box-and-pointer notation, each object is shown as a pointer to a box. The box for a primitive object contains a representation of the object. For example, the box for a number contains a numeral. The box for a pair is actually a double box, the left part containing (a pointer to) the car of the pair and the right part containing the cdr.

In this case the book is talking about pairs (e.g. (cons 1 2)) and how they are represented. However, we can also use pairs to construct a list like this:

(cons 1 (cons 2 '()))

Though box-and-pointer notation is just a notation and useless, I think this looks a lot like a linked list. As I understand it, a linked list node contains a value and a pointer to the rest of the list (another linked list). Given that, I think a list built from cons is structured like a linked list. What confuses me is:

The box for a pair is actually a double box, the left part containing (a pointer to) the car of the pair and the right part containing the cdr.

I originally thought the pointer should be on cdr because that would be the next list if we were constructing a list through pairs.

I think this might be a different kind of pointer altogether. What exactly does a pointer mean in this case? The only pointers I know are the ones used in C. Does SICP even mention anything about C pointers?

score 7 · Answer 1 · edited Feb 01 '16 at 11:52

"I originally thought the pointer should be on cdr because that would be the next list if we were constructing a list through pairs."

A cons cell has two values, a car value and a cdr value. What you put into those is not constrained. You can put anything into them.

You can build different data structures out of cons cells. A singly linked list is just one of them. You could also build a binary tree, an assoc list (a list of cons cells, each pairing a key with a value), a circular data structure, and more.

If we do

(cons (cons :foo 10)
      (cons :bar 5))

... then how the cons cell refers to its car value is mostly hidden from the programmer. Underneath, most implementations have some kind of data structure with pointers from a cons cell to its car and cdr components. There are usually optimizations for small objects such as characters and small integers (fixnums): these can be stored directly in the car and cdr slots instead of behind a pointer to a separate object.
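As a rough illustration of "two slots that can reference anything", here is a minimal Python sketch. The `Cons` class and the names `foo`, `bar`, `outer`, and `loop` are hypothetical, made up for this example; it models the idea only and is not how any particular Lisp implementation lays out memory. Python object references play the role of the pointers.

```python
class Cons:
    """Hypothetical model of a cons cell: two slots, each holding a reference."""
    __slots__ = ("car", "cdr")

    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr


# Mirrors (cons (cons :foo 10) (cons :bar 5)); strings stand in for keywords.
foo = Cons("foo", 10)
bar = Cons("bar", 5)
outer = Cons(foo, bar)

# Both slots of the outer cell refer to other cells, not to copies of them.
assert outer.car is foo
assert outer.cdr is bar
assert outer.car.cdr == 10

# Because a slot is only a reference, a cell can even refer back to itself,
# which gives a circular structure.
loop = Cons(1, None)
loop.cdr = loop
assert loop.cdr.cdr is loop
```

Nothing in the sketch treats the cdr specially: which slot means "next" is a convention of the data structure you build, not a property of the cell.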

Summary:

A cons cell has two values: a car and a cdr. Both are fully unconstrained: you can reference any other object/value.

Most of the implementation is hidden. In Lisp, all you get is the following interface with the basic functions:

(consp thing) : the predicate returns T if thing is a cons cell.
(car cons-cell) : the car value of a cons cell
(cdr cons-cell) : the cdr value of a cons cell
(cons thing-0 thing-1) : creates a cons cell, thing-0 and thing-1 can be anything.

Lists are made out of cons cells. But there are other data structures which can be made out of cons cells.
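To make "the implementation is hidden behind the interface" concrete, here is a small Python sketch of the same four operations. It deliberately uses a closure as the representation, loosely in the spirit of the illustrative procedural representations of pairs that SICP presents; the `is_cons` tag and the use of `None` for the empty list are assumptions of this sketch, not part of any Lisp.

```python
def cons(a, b):
    """Create a cell; a and b are captured by the closure and can be anything."""
    def cell(want_car):
        return a if want_car else b
    cell.is_cons = True  # hypothetical tag so consp can recognise cells
    return cell


def consp(thing):
    """Predicate: True only for values created by cons."""
    return callable(thing) and getattr(thing, "is_cons", False) is True


def car(cell):
    """Return the first component of a cell."""
    return cell(True)


def cdr(cell):
    """Return the second component of a cell."""
    return cell(False)


# Mirrors (cons 1 (cons 2 '())), with None standing in for the empty list.
lst = cons(1, cons(2, None))
assert consp(lst)
assert car(lst) == 1
assert car(cdr(lst)) == 2
assert cdr(cdr(lst)) is None
assert not consp(42)
```

Code that only calls `cons`, `car`, `cdr`, and `consp` cannot tell whether the cell is a closure, a two-element array, or a pair of machine pointers, which is why the representation rarely matters to the programmer.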

"Most of the implementation is hidden." - Will I be able to see those hidden things like pointers reading further through the book? — lightning_missile, Feb 01 '16 at 08:56
@morbidCode The implementation depends on what Lisp implementation you're using. You'll have to read its source/manuals if you want to figure that out. But as a programmer it really doesn't matter to you. Just keep in mind that conses consist of two pointers and datastructures built on them are traversed/manipulated accordingly. — jkiiski, Feb 01 '16 at 09:04
@morbidCode SICP provides a couple of possible implementations of `cons`; none of them is practical but more for illustration purpose of either lambda-calculus and message passing. But the point is: it doesn't matter! Unless you are making your own compiler. — mobiuseng, Feb 01 '16 at 10:51
@mobiuseng well I thought chapter 5 is about compilers. I would have thought at least they would provide an implementation to cons in that section. — lightning_missile, Feb 01 '16 at 11:05
@morbidCode To be honest, I haven't started that chapter yet, so there might be a proper `cons` implementation. — mobiuseng, Feb 01 '16 at 11:27

jkiiski · Accepted Answer · 2016-02-01T06:42:25.307

3

Yes, you are correct that cons cells are used to build linked lists in Lisps.

The left part, the car, contains the value stored in that cons cell. It's a pointer because the value may be a number, an object, another cons (in the case of a tree, for example) or anything else.

You are also correct that the right part, cdr, contains a pointer to the next cons in the list.

So a list (54 "foobar" 3) would look like this:

            (car cdr)
             /     \
            54      (car cdr)
                     /     \
                  "foobar"  (car nil)
                             / 
                            3
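
The same structure can be sketched in Python to show how following the cdr pointers walks the list. The `Cons` type, the `NIL` constant, and the `elements` helper are hypothetical names introduced only for this illustration.

```python
from typing import Any, NamedTuple


class Cons(NamedTuple):
    """Hypothetical cons cell: car is the element, cdr is the rest of the list."""
    car: Any
    cdr: Any


NIL = None  # stands in for the empty list that terminates the chain

# The list (54 "foobar" 3) from the diagram: three cells chained through cdr.
lst = Cons(54, Cons("foobar", Cons(3, NIL)))


def elements(cell):
    """Collect each car while following cdr references until NIL is reached."""
    out = []
    while cell is not NIL:
        out.append(cell.car)
        cell = cell.cdr
    return out


assert elements(lst) == [54, "foobar", 3]
assert lst.cdr.cdr.cdr is NIL
```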

edited Feb 01 '16 at 06:42

answered Feb 01 '16 at 06:33

jkiiski


Thanks. But what is the use of the word "pointer" in this case? Is the book asking me to think of pairs as holding pointers, i.e. objects that hold memory addresses like in C or C++? Or is this purely notation in a diagram and not really related to pairs? – lightning_missile Feb 01 '16 at 06:52
@morbidCode As pointers to memory addresses. A cons/list is not stored as a contiguous chunk like an array, but rather as pointers to where the value/next part is. – jkiiski Feb 01 '16 at 07:00
Then what does the "box" represent? I'm curious: if a pair's items are actually pointers, why bother with this box-and-pointer notation at all? – lightning_missile Feb 01 '16 at 07:08
@morbidCode The boxes are just a way of drawing the cons cell. The cons is basically an array of two pointers, so the box represents that. The arrows then point to what the pointers point to. – jkiiski Feb 01 '16 at 07:16
@morbidCode Just to add to previous comment: the whole idea of "drawing a box" around `cons`-cell is to make it "one thing": instead of two pointers (or even objects, at this point this is irrelevant) now we have *one pair*. PS: And `car` also contains a pointer - `cons`-cells are more general than just single-linked lists. – mobiuseng Feb 01 '16 at 08:05
