How to implement blocking iterator over stdin?

Question

I need to implement a long-running program that receives messages via stdin. The protocol defines each message as a length indicator (for simplicity, a 1-byte integer) followed by a string of the length given by that indicator. Messages are NOT separated by any whitespace. The program is expected to consume all messages from stdin and then wait for further messages.

How do I implement such waiting on stdin?

I implemented the iterator so that it tries to read from stdin and retries on error. It works, but it is very inefficient. I would like the iterator to read a message only when new data arrives.

My implementation is using read_exact:

use std::io::{Read, stdin, Error as IOError, ErrorKind};

pub struct In<R>(R) where R: Read;

pub trait InStream{
    fn read_one(&mut self) -> Result<String, IOError>;
}

impl <R>In<R> where R: Read{
    pub fn new(stdin: R) -> In<R> {
        In(stdin)
    }
}

impl <R>InStream for In<R> where R: Read{
    /// Read one message from stdin and return it as string
    fn read_one(&mut self) -> Result<String, IOError>{

        const LENGTH_INDICATOR: usize = 1;
        let stdin = &mut self.0;

        // Read the length prefix.
        let mut size = [0u8; LENGTH_INDICATOR];
        stdin.read_exact(&mut size)?;
        let size = u8::from_be_bytes(size) as usize;

        // Read the message body, propagating a short read instead of ignoring it.
        let mut buffer = vec![0u8; size];
        stdin.read_exact(&mut buffer)?;
        String::from_utf8(buffer).map_err(|_| IOError::new(ErrorKind::InvalidData, "not utf8"))
    }
}
impl <R>Iterator for In<R> where R:Read{
    type Item = String;
    fn next(&mut self) -> Option<String>{
        self.read_one()
            .ok()
    }
}

fn main(){
    let mut in_stream = In::new(stdin());
    loop{
        match in_stream.next(){
            Some(x) => println!("x: {:?}", x),
            None => (),
        }
    }
}

I went through the Read and BufReader documentation, but no method seems to solve my problem, as the documentation for read contains the following text:

This function does not provide any guarantees about whether it blocks waiting for data, but if an object needs to block for a read and cannot, it will typically signal this via an Err return value.

How do I implement waiting for data on stdin?

===

Edit: a minimal reproducer that does not block and instead loops, returning an UnexpectedEof error rather than waiting for data:

use std::io::{Read, stdin};
fn main(){
    let stdin = stdin();
    let mut stdin_handle = stdin.lock();
    loop{
        let mut buffer = vec![0u8; 4];
        // Expected to block until 4 bytes are available.
        let res = stdin_handle.read_exact(&mut buffer);
        println!("res: {:?}", res);
        println!("buffer: {:?}", buffer);
    }
}

I run it on OSX with cargo run < in, where in is a named pipe. I fill the pipe with echo -n "1234" > in.

It waits for the first input and then loops without blocking:

res: Ok(())
buffer: [49, 50, 51, 52]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
...

I would like the program to wait until there is sufficient data to fill the buffer.

What do you mean by: `I would like the iterator to read the message when new data comes`? You want it to read ahead and have the data ready when asked for it? — Netwave, Oct 20 '21 at 07:35
The quote you give: "This function does not provide any guarantees [...]" is from the trait `Read`, which is very generalistic. *Not sure about it*, but this warning might not apply to the specific case of `Stdin`, because in this case, "waiting for data" is possible. — yolenoyer, Oct 20 '21 at 07:42
@netwave I want the call `in_stream.next()` to block until it is able to return data. Did I make that clear? — user3589900, Oct 20 '21 at 07:52
`StdinLock` implements `BufRead`, so it's trivial to read in a loop. The implementation there *does* block, so you don't have to do anything — Svetlin Zarev, Oct 20 '21 at 07:58
@SvetlinZarev +1, I don't get how reading could be non-blocking. — Netwave, Oct 20 '21 at 08:06
Thanks @SvetlinZarev! I changed all `Read` to `BufRead`, and I am passing `stdin.lock()` instead of `let mut in_stream = In::new(stdin());` but the loop keeps running, so I suspect that `read_exact` is not blocking. What am I doing wrong? It looks like I did not understand you well... — user3589900, Oct 20 '21 at 08:24
Reading from stdin using `read_exact()` does block if not enough data is available, unless an EOF is sent to stdin before the buffer can be filled, in which case it returns an error. It doesn't matter whether you read from stdin unbuffered or buffered – it should block in either case. What evidence do you see that it doesn't block? Can you describe how we could reproduce the problem? And, just in case, what platform are you on? — Sven Marnach, Oct 20 '21 at 09:16
Thanks for your advice. I amended the original question with a reproducer and the expected results. My platform is OSX. Maybe I am using the named pipe in the wrong way? — user3589900, Oct 20 '21 at 09:55
Thank you, @user4815162342 you solved my issue! Spawning another program writing to pipe `tail -f /dev/null > in` works. — user3589900, Oct 20 '21 at 11:10

user4815162342 · Accepted Answer · 2021-10-20T14:06:40.777

As others explained, the docs on Read are written very generally and don't apply to standard input, which is blocking. In other words, your code with the buffering added is fine.

The problem is how you use the pipe. For example, if you run mkfifo foo; cat <foo in one shell, and echo -n bla >foo in another, you'll see that the cat in the first shell will display bla and exit. That is because closing the last writer of the pipe sends EOF to the reader, rendering your program's stdin useless.
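Once that EOF has been reached, retrying the read in a loop can only spin, so it may be cleaner to end the iteration there. A minimal sketch, assuming the 1-byte length prefix from the question; `read_message` is a hypothetical helper name, not something from the question's code or the standard library:

```rust
use std::io::{self, ErrorKind, Read};

/// Reads one length-prefixed message (1-byte length, then that many bytes).
/// Returns Ok(None) when the stream ends before a new message starts.
fn read_message<R: Read>(reader: &mut R) -> io::Result<Option<String>> {
    let mut len = [0u8; 1];
    // read_exact blocks until the byte arrives; UnexpectedEof here means
    // the input ended at a message boundary.
    match reader.read_exact(&mut len) {
        Ok(()) => {}
        Err(e) if e.kind() == ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    let mut body = vec![0u8; len[0] as usize];
    // EOF in the middle of a message is a real error and is propagated.
    reader.read_exact(&mut body)?;
    String::from_utf8(body)
        .map(Some)
        .map_err(|_| io::Error::new(ErrorKind::InvalidData, "not utf8"))
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let mut input = stdin.lock();
    // Blocks while waiting for data; leaves the loop once EOF is reached.
    while let Some(message) = read_message(&mut input)? {
        println!("x: {:?}", message);
    }
    Ok(())
}
```

With this shape the program exits when the last writer closes the pipe, instead of printing UnexpectedEof errors forever.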

You can work around the issue by starting another program in the background that opens the pipe in write mode and never exits, for example tail -f /dev/null >foo. Then echo -n bla >foo will be observed by your program, but won't cause its stdin to close. Holding the write end of the pipe open could probably also be achieved from Rust.
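A rough sketch of the Rust variant, under the assumption that the program is given the FIFO's path as an argument rather than having it redirected with `<` (redirected stdin gives the program no path to reopen). `open_fifo_held` is a hypothetical helper name, and this has not been checked against the asker's setup:

```rust
use std::env;
use std::fs::{File, OpenOptions};
use std::io::{self, Read};
use std::path::Path;

/// Opens the FIFO at `path` for reading, then opens a second handle for
/// writing that the caller must keep alive. While that handle exists the
/// pipe always has a writer, so reads block instead of reporting EOF.
fn open_fifo_held(path: &Path) -> io::Result<(File, File)> {
    // Blocks until some writer (for example `echo -n bla >foo`) opens the FIFO.
    let reader = File::open(path)?;
    // Does not block, because a reader (this process) already exists.
    let keep_alive = OpenOptions::new().write(true).open(path)?;
    Ok((reader, keep_alive))
}

fn main() -> io::Result<()> {
    // Assumed usage: the FIFO path is passed as the first argument.
    let path = match env::args().nth(1) {
        Some(path) => path,
        None => {
            eprintln!("usage: <program> <fifo-path>");
            return Ok(());
        }
    };
    let (mut reader, _keep_alive) = open_fifo_held(Path::new(&path))?;
    loop {
        let mut buffer = [0u8; 4];
        // Blocks until 4 bytes are available, even between writers.
        reader.read_exact(&mut buffer)?;
        println!("buffer: {:?}", buffer);
    }
}
```

The `_keep_alive` binding matters: binding to a bare `_` would drop the write handle immediately and bring the EOF behaviour back.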
