unread-char behaviour deviating from spec?

Question

The Common Lisp HyperSpec page for unread-char says both of the following:

"unread-char is intended to be an efficient mechanism for allowing the Lisp reader and other parsers to perform one-character lookahead in input-stream."
"It is an error to invoke unread-char twice consecutively on the same stream without an intervening call to read-char (or some other input operation which implicitly reads characters) on that stream."

I'm investigating how to add support for multiple-character lookahead for CL streams for a parser I'm planning to write, and just to confirm the above, I ran the following code:

(defun unread-char-test (data)
  (with-input-from-string (stream data)
    (let ((stack nil))
      (loop
         for c = (read-char stream nil)
         while c
         do (push c stack))
      (loop
         for c = (pop stack)
         while c
         do (unread-char c stream)))
    (coerce
     (loop
        for c = (read-char stream nil)
        while c
        collect c)
     'string)))

(unread-char-test "hello")
==> "hello"

It doesn't signal an error (on SBCL or CCL; I haven't tested other implementations yet), but I don't see how any read operation, implicit or explicit, could be taking place on the stream between the consecutive calls to unread-char.

This behaviour is good news for multiple-character lookahead, as long as it is consistent, but why isn't an error being thrown?

I assume the developers didn't bother signalling an error when unreading multiple characters on a string input stream when the string is in memory and unreading is essentially just decrementing the current index. — jkiiski, Jul 29 '17 at 13:59

score 5 · Answer 1 · answered Jul 29 '17 at 16:00

In response to jkiiski's comment I did some more digging. I defined a function similar to the one above, but taking the stream as an argument (for easier reuse):

(defun unread-char-test (stream)
  (let ((stack nil))
    (loop
       for c = (read-char stream nil)
       while c
       do (push c stack))
    (loop
       for c = (pop stack)
       while c
       do (unread-char c stream)))
  (coerce
   (loop
      for c = (read-char stream nil)
      while c
      collect c)
   'string))

I then ran the following in a second REPL:

(defun create-server (port)
  (usocket:with-socket-listener (listener "127.0.0.1" port)
    (usocket:with-server-socket (connection (usocket:socket-accept listener))
      (let ((stream (usocket:socket-stream connection)))
        (print "hello" stream)))))

(create-server 4000)

And the following in the first REPL:

(defun create-client (port)
  (usocket:with-client-socket (connection stream "127.0.0.1" port)
    (unread-char-test stream)))

(create-client 4000)

And it did throw the error I expected:

Two UNREAD-CHARs without intervening READ-CHAR on #<BASIC-TCP-STREAM ISO-8859-1 (SOCKET/4) #x302001813E2D>
   [Condition of type SIMPLE-ERROR]

This suggests that jkiiski's assumption is correct. The original behaviour was also observed when the input was read from a text file, like so:

(with-open-file (stream "test.txt" :direction :output :if-exists :supersede)
  (princ "hello" stream))

(with-open-file (stream "test.txt")
  (unread-char-test stream))
==> "hello"
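
The same check can be run without landing in the debugger. The following is a minimal sketch, not part of the tests above: try-double-unread is a hypothetical helper name, and it assumes the stream has at least two characters available. It reads two characters, unreads both, and reports whether the implementation objected:

```lisp
;; Hypothetical helper (illustrative name, not from any library).
;; Reads two characters, then attempts two consecutive unread-char calls.
(defun try-double-unread (stream)
  (let ((a (read-char stream))
        (b (read-char stream)))
    (handler-case
        (progn
          (unread-char b stream)
          ;; Second consecutive unread: the case the spec calls an error.
          (unread-char a stream)
          :accepted)
      (error (condition)
        ;; The implementation chose to detect the situation.
        (values :signalled condition)))))

(with-input-from-string (stream "hello")
  (try-double-unread stream))
```

Which of the two values comes back depends on the implementation and on the kind of stream passed in, as the socket and file tests above illustrate.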

I imagine that, when dealing with local file I/O, the implementation reads large chunks of the file into memory, and read-char then reads from that buffer. If so, this also supports the assumption that typical implementations do not signal the error described in the specification when unreading from a stream whose contents are held in memory.
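
Since the behaviour differs between stream types, multiple-character lookahead that has to work everywhere probably should not rely on repeated unread-char at all. One possible approach, sketched here under my own assumptions (the lookahead structure and the la- functions are hypothetical names, not part of any library), is to keep the pushed-back characters in a small wrapper around the stream:

```lisp
;; Hypothetical wrapper: all names here are illustrative.
(defstruct (lookahead (:constructor make-lookahead (stream)))
  stream
  (pending '()))   ; pushed-back characters, next one to read first

(defun la-read-char (la)
  "Return the next character, or NIL at end of input."
  (if (lookahead-pending la)
      (pop (lookahead-pending la))
      (read-char (lookahead-stream la) nil nil)))

(defun la-unread-char (char la)
  "Push CHAR back; may be called any number of times in a row."
  (push char (lookahead-pending la))
  nil)

(defun la-peek (la n)
  "Return up to N upcoming characters as a string without consuming them."
  (let ((chars '()))
    (dotimes (i n)
      (let ((c (la-read-char la)))
        (if c
            (push c chars)
            (return))))
    ;; CHARS is in reverse reading order, so pushing each element back
    ;; in turn restores the original order in the pending list.
    (dolist (c chars)
      (la-unread-char c la))
    (coerce (reverse chars) 'string)))

(with-input-from-string (stream "hello")
  (let ((la (make-lookahead stream)))
    (list (la-peek la 3) (la-read-char la))))
==> ("hel" #\h)
```

The underlying stream never sees an unread-char call, so the wrapper behaves the same on string, file and socket streams. The trade-off is that the pushed-back characters live in the wrapper, so anything that reads from the raw stream directly will not see them.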
