at_end_of_stream on stdin in Prolog

Question

I ask SICStus Prolog ?- at_end_of_stream. on the top-level and immediately I get: no.

GNU Prolog and Scryer Prolog do the same.

Trealla Prolog and SWI-Prolog, however, choose to wait for input before answering, and rumor has it that behaving like this has been quite common among Prolog systems, especially in the past.

If I look at the ISO-Prolog standard, that behavior becomes spurious:

8.11.8.5 Bootstrapped built-in predicates
The built-in predicates at_end_of_stream/0 and at_end_of_stream/1 examine the single stream-property end_of_stream/1.

A goal at_end_of_stream is true iff the current input has a stream position end-of-stream or past-end-of-stream (7.10.2.9, 7.10.2.13).

A goal at_end_of_stream(S_or_a) is true iff the stream associated with stream or alias S_or_a has a stream position end-of-stream.
at_end_of_stream :-
  current_input(S),
  stream_property(S, end_of_stream(E)),
  !,
  (E = at ; E = past).
[For the sake of brevity, I omitted the code for at_end_of_stream/1.]

So it appears that the standard is quite clearly on the side of SICStus Prolog and GNU Prolog.

So my question boils down to this:

Is this "waiting" behavior simply non-conformance, a kind of anachronism justified on the basis of practicality / compatibility—or is there more to it?

notoria · Accepted Answer · 2022-09-07T17:21:05.817

Handling end of stream is hard; on the top-level it's even harder.

A very simple and basic top-level is like sh (dash). The Prolog top-level is similar: it differs by having auto-completion/choice (as in Scryer Prolog and SWI-Prolog) and by not accepting input submitted with ctrl-D, but it could behave like sh.

Two queries that help test and better understand the top-level are ?- get_char(C).%a (where C binds to %, source) and ?- get_char(C). (when submitted with enter, C binds to \n; when submitted with ctrl-D, it either waits or binds C to end_of_file).

Why those queries? The top-level can be modeled as read_term(user_input, Goal, []), call(Goal):

Case get_char(C).%a submitted with enter: The query is parsed and the user_input is %a\n thus C binds to % and if get_char(C) is replaced by at_end_of_stream then at_end_of_stream/0 fails.
Case get_char(C). submitted with enter: The query is parsed and the user_input is \n thus C binds to \n and if get_char(C) is replaced by at_end_of_stream then at_end_of_stream/0 fails.
Case get_char(C). submitted with ctrl-D: The query is parsed and the user_input is empty. Now the stream position of user_input should be end-of-stream or past-end-of-stream if read_term/3 is executed as procedurally specified in 8.14.1.1:
- If user_input doesn't have an end then, by the note of 7.10.2.9, the stream position can't be end-of-stream or past-end-of-stream. Is the stream reset, so that it has the property end_of_stream(not)? What happens if the stream doesn't have the property eof_action(reset)? In any case get_char/1 waits, and if get_char(C) is replaced by at_end_of_stream then at_end_of_stream/0 fails (there is an opportunity for at_end_of_stream/0 to wait here).
- Else user_input has a size, but it's unknown (or variable but still unknown); then:
  - If stream position is end-of-stream then C binds to end_of_file and if get_char(C) is replaced by at_end_of_stream then at_end_of_stream/0 succeeds.
  - Else stream position is past-end-of-stream then get_char/1 waits and if get_char(C) is replaced by at_end_of_stream then at_end_of_stream/0 succeeds.
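
The model used in the cases above can be written out as a small loop. This is only a sketch in ISO Prolog, assuming the top-level does nothing beyond read_term/3 and call/1; the names toplevel/0 and run/1 are made up for illustration, and real top-levels do considerably more (prompts, variable bindings, alternative solutions).

```prolog
% Hypothetical minimal top-level: read a goal from user_input, run it, repeat.
% toplevel/0 and run/1 are illustrative names, not part of any system.
toplevel :-
    repeat,
    read_term(user_input, Goal, []),
    (   Goal == end_of_file      % read_term/3 yields end_of_file at end of stream
    ->  !
    ;   run(Goal),
        fail                     % back to repeat/0 for the next query
    ).

% Run one goal and report the outcome; after an error it prints the
% error term and then "no".
run(Goal) :-
    (   catch(Goal, Error, (write(error(Error)), nl, fail))
    ->  write(yes)
    ;   write(no)
    ),
    nl.
```

Since Goal only runs after read_term/3 has consumed the query up to its end token, whatever follows the final `.` on the same line (such as %a or the newline) is still in user_input when the goal executes, which is what the cases above rely on.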

Sadly there is a bit of twisting at the end.

I might have missed or misread something. If not, this undefinedness allows some optimization/simplification of the implementation and explains the difference between engines. Also, the actual top-level is more complicated than read_term(user_input, Goal, []), call(Goal) (even if that is the ideal one).

When it comes to at_end_of_stream/0, it doesn't seem like there is a reason to wait.

Digression

If at_end_of_stream/0 is implemented with peek_char/1 without executing eof_action then that might explain the wait.
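
To make that concrete, here is a sketch of such a peeking definition. It is an assumption for illustration, not how any of the systems mentioned is known to be implemented, and peeking_at_end/0,1 are made-up names. It chooses between peek_char/2 and peek_byte/2 based on the stream's type property and, unlike an internal implementation, it cannot avoid triggering eof_action.

```prolog
% Hypothetical: decide "at end" by peeking instead of consulting end_of_stream/1.
% S must be a stream term (not an alias), as required by stream_property/2.
% On an interactive stream the peek blocks until input (or ctrl-D) arrives.
peeking_at_end(S) :-
    stream_property(S, type(Type)),
    !,
    (   Type == binary
    ->  peek_byte(S, B),
        B =:= -1                 % peek_byte/2 yields -1 at end of stream
    ;   peek_char(S, C),
        C == end_of_file         % peek_char/2 yields end_of_file at end of stream
    ).

% Same check on the current input stream.
peeking_at_end :-
    current_input(S),
    peeking_at_end(S).
```

With a definition along these lines, ?- peeking_at_end. on an interactive user_input has nothing to peek at and would therefore wait.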

On other operating systems it may be different; I use and test on Linux.

On GNU Prolog (version 1.5.0), there is the issue that a stream with property eof_action(reset) is never at end of stream. This explains why at_end_of_stream/0 fails.

$ echo -n | gprolog --init-goal "(at_end_of_stream, write(at), nl, halt ; peek_char(C), writeq(not(C)), nl, halt)"
not(end_of_file)
$ echo | gprolog --init-goal "(at_end_of_stream, writeq(at), nl, halt ; peek_char(C), writeq(not(C)), nl, halt)"
not('\n')

at_end_of_stream/0 fails even when the stdin is empty.

On Trealla Prolog (version v2.1.11):

$ echo -n | ./tpl -g "(at_end_of_stream, writeq(at), nl, halt ; peek_char(C), writeq(not(C)), nl, halt)"
at
$ echo | ./tpl -g "(at_end_of_stream, writeq(at), nl, halt ; peek_char(C), writeq(not(C)), nl, halt)"
not('\n')
$ echo -n | ./tpl -g "(stream_property(S, alias(user_input)), stream_property(S, end_of_stream(Eos)), at_end_of_stream(S), writeq(eos(Eos)), halt ; halt)"
eos(not)

The property end_of_stream doesn't agree with at_end_of_stream/1 (permuting stream_property/2 and at_end_of_stream/1 doesn't change the result).
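
The same comparison can be packaged as a predicate, shown here as a sketch. eos_report/1 is a made-up helper that uses only stream_property/2 and at_end_of_stream/1; going by the definition quoted in the question, a property value of not would be expected to pair with no.

```prolog
% Hypothetical helper: print the end_of_stream/1 property next to the
% outcome of at_end_of_stream/1 for the same stream term S.
eos_report(S) :-
    stream_property(S, end_of_stream(E)),
    !,
    (   at_end_of_stream(S)
    ->  Answer = yes
    ;   Answer = no
    ),
    writeq(eos(E, Answer)),
    nl.
```

It can be tried with ?- current_input(S), eos_report(S). Note that on systems where these predicates wait for input, the query waits as well.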

On Scryer Prolog (on master (6b8e6204957bfc3136ea39ec659d30627775260d) or rebis-dev (c1945caf11c0d202f4121de446f1694854dcba47)):

$ echo -n | ./target/release/scryer-prolog -g "(at_end_of_stream, write(at), nl, halt ; peek_char(C), writeq(not(C)), nl, halt)"
not('\x0\')
$ echo | ./target/release/scryer-prolog -g "(at_end_of_stream, write(at), nl, halt ; peek_char(C), writeq(not(C)), nl, halt)"
not('\n')

at_end_of_stream/0 fails even when the stdin is empty.

Anything happened since [#270](http://www.complang.tuwien.ac.at/ulrich/iso-prolog/conformity_testing#270)? — false, Sep 07 '22 at 17:09
With GNU Prolog version 1.5.0, `C = '%'` on #270 on the top-level. — notoria, Sep 07 '22 at 17:26
No change. #270 does not test the top level loop which is out of scope, see 1 Scope, NOTE f. Thus, `?- read(X),X.` is used instead. And there everything is the same, even in 1.5.1 — false, Sep 07 '22 at 17:39

notoria · Answer 2 · 2022-09-07T20:01:31.917

How to make the `at_end_of_stream/0`/`stream_property/2` wait?

Using the top-level read_term(user_input, Goal, []), call(Goal), the query ?- get_char(C), at_end_of_stream. can wait.

The query ?- at_end_of_stream. doesn't wait because read_term/3 needs to peek to determine the end token, and in doing so it can update the stream position.

If user_input doesn't have an end then with the note of 7.10.2.9, at_end_of_stream/0 fails.
Else user_input has a size but it's unknown (or variable but still unknown); then at_end_of_stream/0 waits. The query get_char(C), at_end_of_stream. is submitted with enter. The query is parsed and user_input is \n. Now get_char/1 reads \n, user_input is empty and get_char/1 should update the stream position as specified by 8.12.1.1. But get_char/1 can't update user_input and need not do an update by peeking; that can be left for when the user tries to observe it with at_end_of_stream/0. Now at_end_of_stream/0 waits.

Another way to make at_end_of_stream/0 wait is to use the top-level at_end_of_stream. The initial stream position of user_input is unknown (the size of the stream is unknown), thus at_end_of_stream/0 waits.

Small digression: get_char/1 can't update the stream position of user_input unless it reads end_of_file in which case user_input stream position is updated to past-end-of-stream.

edited Sep 07 '22 at 20:01

answered Sep 07 '22 at 16:27

notoria

2,053
1
4
15

1

Note that peeking is a bit cumbersome since you have to use either `peek_char/1/2` or `peek_byte/1/2` depending on `text` or `binary`. – false Sep 07 '22 at 17:11
Removed the implementation detail. But that would be internally without `eof_action` being triggered. – notoria Sep 07 '22 at 20:07
Let's assume you are right when you say "it *can* wait". It clearly needs not wait, so what's the benefit in making it wait in these cases? (As a user I cannot rely on it. As an implementor choosing the "fail immediately" path is easier.) – repeat Sep 09 '22 at 07:34
Indeed, using the note of `7.10.2.9` is much easier for the implementor, but the user can't get any useful information. The wait is because there isn't any input to decide on. So the stdin (`user_input`) is like a file, `at_end_of_stream/1` always gives the right result. – notoria Sep 09 '22 at 08:10
Yes, but if a user writing ISO-Prolog code cannot *depend* on `at_end_of_stream/1` behaving like this, what is it good for? Why not use `peek_char/1` / `peek_byte/1` as a portable alternative instead? – repeat Sep 09 '22 at 10:00
But `peek_char/1` can't replace `at_end_of_stream/1` due to `eof_action(reset)`. The wait of `at_end_of_stream/1` also "happens" for a file but the input comes from the hard drive while on stdin the input comes from the user. An example using pipe in [digression](https://stackoverflow.com/a/73543996/14414324) (Trealla in particular), there is no wait because of pipe. This mainly makes stdin behave like a file. – notoria Sep 09 '22 at 10:37