How can the hasNext methods of Scanner "not advance past any input"?

Question

The documentation of the hasNext and hasNextXxx methods of java.util.Scanner reads:

Returns true if this scanner has another token in its input. This method may block while waiting for input to scan. The scanner does not advance past any input.

How can the scanner not advance through the input stream while it clearly has to check if a next token is present in the input (as Readable or InputStream)?

I guess it's the same thing, but I hope that `peek` doesn't have the same kind of documentation mistakes (or at least badly written statements). — Maarten Bodewes, Nov 14 '19 at 17:32

Maarten Bodewes · Answer 1 · 2019-11-14T18:17:17.683

It does advance through the input stream. It just doesn't advance past the next input from the point of view of the Scanner and the Scanner's user. So arguably this piece of documentation is badly worded.

This can be easily tested by putting just a single token in a stream, and then calling hasNext for that token. In that case, you'll find that the stream is exhausted after calling hasNext.

In other words, the Scanner buffers the peeked characters internally, including any delimiter characters. Subsequent calls will then start out from the buffered characters. Scanner doesn't rely on any buffering provided by the underlying stream.

This wording is present in the documentation at least up to Java 12.

Example code:

import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

Reader stringReader = new StringReader("Single line");

Scanner firstScanner = new Scanner(stringReader);
// prints out true because there is a first line
System.out.println(firstScanner.hasNextLine());

Scanner secondScanner = new Scanner(stringReader);
// prints out false because the scanner has advanced the reader
// ... through the character stream generated from the string
System.out.println(secondScanner.hasNextLine());

// prints out true because the first scanner is buffering the input
System.out.println(firstScanner.hasNextLine());
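The single-token test described earlier in this answer can also be checked directly against the stream. The following is a minimal sketch, not a definitive test: the class name `SingleTokenCheck` and both helper methods are hypothetical names introduced for illustration, and it assumes a short input that fits into the Scanner's first read.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class SingleTokenCheck {

    // Hypothetical helper: number of bytes still available in the underlying
    // stream after hasNext() has been called once on a Scanner reading from it.
    static int bytesLeftAfterHasNext(String input) {
        ByteArrayInputStream in =
                new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8));
        Scanner scanner = new Scanner(in, "UTF-8");
        scanner.hasNext(); // the Scanner pulls data from the stream into its own buffer
        return in.available();
    }

    // Hypothetical helper: the token is still delivered by the Scanner afterwards,
    // because it is served from the Scanner's internal buffer.
    static String tokenAfterHasNext(String input) {
        ByteArrayInputStream in =
                new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8));
        Scanner scanner = new Scanner(in, "UTF-8");
        scanner.hasNext();
        return scanner.next();
    }

    public static void main(String[] args) {
        // For a short single-token input the stream is drained by hasNext()
        System.out.println(bytesLeftAfterHasNext("token"));
        // ... but the Scanner still returns the token
        System.out.println(tokenAfterHasNext("token"));
    }
}
```

The first helper looks at the stream, the second at the Scanner. Comparing the two shows that the stream has been consumed while the Scanner's view of the input is unchanged.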

The Scanner documentation has other issues related to hasNext. For instance, the documentation of the Scanner class itself reads:

The next() and hasNext() methods and their companion methods (such as nextInt() and hasNextInt()) first skip any input that matches the delimiter pattern, and then attempt to return the next token.

Obviously the last part is not true for hasNext(). It looks as if hasNext was shoehorned into an existing sentence.

edited Nov 14 '19 at 18:17

answered Nov 14 '19 at 17:23

Maarten Bodewes

how come you make a question and answer it? – Eboubaker Nov 14 '19 at 17:28

@ZOLDIK I first searched on SO, then didn't find the information. So I decided to test it and write a question and answer. Next step is to file a bug report against the documentation, I guess. Of course, feel free to write a better answer, I'll be happy to accept it :) – Maarten Bodewes Nov 14 '19 at 17:29

@ZOLDIK It's encouraged: https://stackoverflow.com/help/self-answer – Sotirios Delimanolis Nov 14 '19 at 17:33

I don't think this is true; e.g. `Scanner sc = new Scanner("Hello world"); sc.next(); sc.hasNext(); sc.findInLine(".")` the last call returns `" "`, not `"w"`, so the `hasNext` call doesn't advance the input to the start of the next token. – kaya3 Nov 14 '19 at 17:46

I think you are entirely missing the "input stream" part of the question and answer. A string is not a stream and doesn't contain any state. If you have any recommendations for changes in the question or answer after re-reading it I'll be happy to take them into account. – Maarten Bodewes Nov 14 '19 at 17:51

The exact same behaviour occurs if you put the string into a ByteArrayInputStream first. – kaya3 Nov 14 '19 at 17:53

You are only calling methods on `sc` and I clearly said that there is no advance in the input / input tokens from the Scanner's point of view. But you've got a point about *what* is being buffered maybe - if it includes delimiter characters, for instance - I'll investigate. – Maarten Bodewes Nov 14 '19 at 17:57

kaya3 · Accepted Answer · 2022-01-22T15:24:48.523

There are two notions of "advancing through input" being conflated here.

The first notion is reading more bytes from the InputStream, which of course the Scanner has to do in order to test whether it has a next token. The Scanner does this by buffering the new bytes read from the InputStream, so that those bytes can later be read from the Scanner when necessary. We could call this "advancing through the input stream".

The second notion is advancing the position of the next byte to be consumed from the buffer. This happens when you call methods like next which consume tokens, to mark that those bytes have been "used" and shouldn't be read again; calling next advances the position in the buffer. We could call this "advancing through the input in the Scanner's buffer".

The documentation's statement "The scanner does not advance past any input" refers to the second notion: calling hasNext doesn't advance through the Scanner's buffer, it merely reads more bytes from the input stream into the buffer. To verify this, you can create an input stream, call hasNext, and then read the next character (rather than the next token) using findInLine:

> InputStream is = new ByteArrayInputStream("hello world".getBytes());
> Scanner sc = new Scanner(is);
> sc.next()
"hello" (String)
> sc.hasNext()
true (boolean)
> sc.findInLine(".")
" " (String)

So calling hasNext not only doesn't consume the next token, but it also doesn't consume the delimiter before the next token. The Scanner has not "advance[d] past any input" in this sense.
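Both notions can also be observed side by side in one program. The following is a rough sketch under stated assumptions: `TwoNotions`, `CountingInputStream` and the two helper methods are hypothetical names introduced here, not part of the JDK. The wrapper counts the bytes the Scanner pulls from the stream (the first notion), while repeated hasNext calls followed by next show that the position in the Scanner's buffer has not moved (the second notion).

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class TwoNotions {

    // Hypothetical wrapper: counts the bytes read from the wrapped stream.
    static final class CountingInputStream extends FilterInputStream {
        long bytesRead = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) bytesRead++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) bytesRead += n;
            return n;
        }
    }

    private static CountingInputStream streamOf(String input) {
        return new CountingInputStream(
                new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)));
    }

    // First notion: bytes pulled from the stream by a single hasNext() call.
    static long bytesPulledByHasNext(String input) {
        CountingInputStream in = streamOf(input);
        Scanner sc = new Scanner(in, "UTF-8");
        sc.hasNext();
        return in.bytesRead;
    }

    // Second notion: however often hasNext() is called, next() still
    // returns the first token, so the buffer position was not advanced.
    static String firstTokenAfterRepeatedHasNext(String input, int calls) {
        Scanner sc = new Scanner(streamOf(input), "UTF-8");
        for (int i = 0; i < calls; i++) {
            sc.hasNext();
        }
        return sc.next();
    }

    public static void main(String[] args) {
        // Greater than zero: the stream itself has been advanced
        System.out.println(bytesPulledByHasNext("hello world"));
        // Still the first token: the Scanner has not advanced past any input
        System.out.println(firstTokenAfterRepeatedHasNext("hello world", 3));
    }
}
```

How many bytes a single hasNext call pulls in depends on the internal read sizes of the Scanner and its decoder, so the sketch only relies on the count being greater than zero.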
