In Scheme, how can I understand "(define (f x . y) (cons x y))"

Question

I'm new to Scheme and I'm having trouble with dotted lists. Here is an example:

(define (f x . y) (cons x y))

When I enter: (f 1 2 3) the result is '(1 2 3). Yes, it returns a list, and at this time

x => 1 and y => '(2 3).

My question is: how does the interpreter know this when f takes a variable number of arguments?

score 5 · Accepted Answer · edited May 23 '17 at 12:27

It's not quite right to say that f takes an unfixed, or arbitrary, number of arguments. It doesn't. It requires at least one argument. If you try calling (f), you'll get an error.

The key to understanding what's going on here may be understanding the dotted pair notation in Lisps. It's covered in another Stack Overflow question, Dot notation in scheme, but the short version is that every list is built from pairs. A single pair can be written as (car . cdr). E.g., (cons 1 2) => (1 . 2). A list like (1 2 3) is actually a pair whose cdr is another pair whose cdr is another pair, and so on, ending in the empty list: (1 . (2 . (3 . ()))). Because that's awkward to read, we typically remove all but the last dot when we write a chain of pairs, and if the last cdr is (), we omit the dot and the (), too. So

(1 . (2 . (3 . 4)))  === (1 2 3 . 4)
(1 . (2 . (3 . ()))) === (1 2 3)

Note that while the default printer won't print it this way, the list (1 2 3 4) can also be written in these ways:

(1 . (2 3 4)) === (1 2 . (3 4)) === (1 2 3 . (4))

You can type any of these forms and the reader will understand them. E.g.,

> '(1 2 . (3 4))
(1 2 3 4)
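A quick way to convince yourself of this at the REPL is to compare the different spellings with equal?. This is only a small sketch; the results in the comments are what a standard Scheme reader should give.

```scheme
;; All of these spellings read as the same four-element list.
(equal? '(1 . (2 3 4)) '(1 2 3 4))   ; => #t
(equal? '(1 2 . (3 4)) '(1 2 3 4))   ; => #t
(equal? '(1 2 3 . (4)) '(1 2 3 4))   ; => #t

;; A chain whose last cdr is not () is an improper list.
(cdr (cdr (cdr '(1 2 3 . 4))))       ; => 4
(list? '(1 2 3 . 4))                 ; => #f
```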

That may be the key to understanding the lambda-list notation. The lambda list of a function is itself a (possibly dotted) list of parameter names, and when the function is called, that lambda list gets destructured against the list of arguments. So, if we have the following lambda lists and use them to destructure the argument list (1 2 3), we get the resulting bindings:

lambda-list    x             y      z
-------------------------------------
(x y z)        1             2      3
(x y . z)      1             2    (3)
(x . y)        1         (2 3)    n/a
x              (1 2 3)     n/a    n/a

That last case might be surprising, but you can test that all of these actually work as expected:

(apply (lambda (x y z)   (list x y z)) '(1 2 3)) => (1 2 3)
(apply (lambda (x y . z) (list x y z)) '(1 2 3)) => (1 2 (3))
(apply (lambda (x . y)   (list x y))   '(1 2 3)) => (1 (2 3))
(apply (lambda x         (list x))     '(1 2 3)) => ((1 2 3))

That last one is kind of neat, because it means that if it weren't part of the language already, you could do:

(define (list . elements) elements)
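The destructuring described in this answer can be sketched as a small procedure. This is only an illustration of the idea, not how any particular implementation binds arguments: bind-formals is a hypothetical helper name, and the sketch assumes an error procedure as in R7RS or SRFI 23.

```scheme
;; Hypothetical helper, not a standard procedure: destructure a lambda list
;; against a list of arguments and return the bindings as an association list.
(define (bind-formals formals args)
  (cond ((symbol? formals)          ; bare symbol or dotted tail: takes the rest
         (list (cons formals args)))
        ((null? formals)            ; no formals left: args must be used up too
         (if (null? args)
             '()
             (error "too many arguments" args)))
        ((null? args)               ; a required formal remains but no argument
         (error "too few arguments" formals))
        (else                       ; pair: bind the car, recur on both cdrs
         (cons (cons (car formals) (car args))
               (bind-formals (cdr formals) (cdr args))))))

(bind-formals '(x y z)   '(1 2 3))  ; => ((x . 1) (y . 2) (z . 3))
(bind-formals '(x y . z) '(1 2 3))  ; => ((x . 1) (y . 2) (z 3))
(bind-formals '(x . y)   '(1 2 3))  ; => ((x . 1) (y 2 3))
(bind-formals 'x         '(1 2 3))  ; => ((x 1 2 3))
```

Note that (z 3) is just the printed form of the pair (z . (3)), so z is bound to the list (3), matching the table above.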

score 2 · Answer 2 · edited Jun 20 '20 at 09:12

In Scheme, a dot before the last formal parameter indicates that this parameter's value inside the procedure will be a list containing all the remaining arguments. So, as you correctly show in your post, y inside the procedure is a list, '(2 3). In f, x is treated as a normal parameter.

(define (f x . y)
 (display "y: ")
 (display y) (newline)) 

(f 1 2 'this 'is 'a 'list)
y: (2 this is a list)

Here is what the Guile reference manual has to say about this:

(variable1 … variablen . variablen+1)

If a space-delimited period precedes the last variable, then the procedure takes n or more variables where n is the number of formal arguments before the period. There must be at least one argument before the period. The first n actual arguments will be stored into the newly allocated locations for the first n formal arguments and the sequence of the remaining actual arguments is converted into a list and the stored into the location for the last formal argument. If there are exactly n actual arguments, the empty list is stored into the location of the last formal argument.
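Applied to the f from the question, where n is 1, the rule quoted above plays out roughly like this. The exact wording of the error for the last call depends on the implementation, so it is left commented out.

```scheme
(define (f x . y) (cons x y))

(f 1 2 3)   ; => (1 2 3)   x is 1, y is (2 3)
(f 1)       ; => (1)       exactly n = 1 arguments, so y is ()
;; (f)      ; error: f requires at least one argument
```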

It's more proper to call it a "dot" rather than a "period". Cons cells are sometimes called "dotted pairs". — C. K. Young, Jun 18 '14 at 13:34

score 0 · Answer 3 · answered Jun 18 '14 at 13:14

The interpreter knows that f has an argument named x followed by the argument list y. So when it encounters the call to f, it puts the first argument into x and the remaining ones into y.

sepp2k

Does the interpreter also know that the args `1 2 3` would be parsed into a list `'(1 2 3)`? – kedebug Jun 18 '14 at 13:27
@kedebug The interpreter knows that `(cons 1 '(2 3))` evaluates to `'(1 2 3)` if that's what you're asking. – sepp2k Jun 18 '14 at 13:32
I'm sorry, I meant that `f` takes `1 2 3`, it could be three args, but it parsed into a list `'(1 2 3)`. Hard for me to understand this. – kedebug Jun 18 '14 at 13:41
@kedebug it is not parsed into `'(1 2 3)`. You create that list in the body of your procedure with `(cons 1 '(2 3))` – Rptx Jun 18 '14 at 13:47
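To make the point in these comments concrete, here is a small sketch that shows what the procedure actually receives. The name show-args is made up for illustration. The three arguments arrive separately; only the ones after the first are collected into a list for y, and nothing becomes '(1 2 3) until the body builds it.

```scheme
;; show-args is a hypothetical name used only for this illustration.
(define (show-args x . y)
  (list 'x x 'y y))

(show-args 1 2 3)            ; => (x 1 y (2 3))
(apply show-args '(1 2 3))   ; => (x 1 y (2 3))  same call, arguments taken from a list
(cons 1 '(2 3))              ; => (1 2 3)        this is what the body of f does
```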

score 0 · Answer 4 · answered Jun 18 '14 at 13:15

The (f x . y) syntax means that the first argument to define is an "improper" list: it is equivalent to what you would get from (cons 'f (cons 'x 'y)) rather than from (cons 'f (cons 'x (cons 'y '()))). So when the function is called, each parameter is bound in turn. First, x is bound to 1. What remains of the parameter list at that point is the bare symbol y rather than a pair, so y is bound to the rest of the argument list, '(2 3).
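You can inspect that structure directly by quoting the form. This is just a sketch for exploring at the REPL; formals is an arbitrary variable name chosen here.

```scheme
;; formals is an arbitrary name for the quoted define header.
(define formals '(f x . y))

(car formals)     ; => f
(cadr formals)    ; => x
(cddr formals)    ; => y   a bare symbol, not the list (y)
(equal? formals (cons 'f (cons 'x 'y)))   ; => #t
(list? formals)   ; => #f  the chain does not end in ()
```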

Vatine

You can use dot notation to express a proper list: '(a . ()) is a proper list. A cons cell is composed of two pointers, so when the dot is used in a parameter list like this, it's a way of binding the contents of the `cdr` pointer to a value, rather than each of the `car`'s contents in turn. – WorBlux Jun 20 '14 at 02:51
