6

I am successfully reading from a pipe from another thread, and printing the output (in an ncurses window as it happens).

I need to do this one character at a time, for various reasons, and I'm using a select() on the FD for the read end of the pipe, along with a few other FDs (like stdin).

My idea is to attempt to read from the pipe only when it is imminently ready to be read, in preference to handling any input. This seems to be working - at least to start. select() sets up the fd_set and if FD_ISSET I do my read() of 1 byte from the FD. But select() says yes one time too many, and the read() blocks.

So my question is this - why would select() report that a fd is ready for reading, if a subsequent read() blocks?

(approximately) This same code worked fine when the other end of the pipe was connected to a forked process, if that helps.

I can post the code on request, but it's bog standard. Set up a fd_set, copy it, select the copy, if the FD is set call a function which reads a byte from the same FD... otherwise revert the fd_set copy

EDIT: by request, here's the code:

Setting up my fd_set:

fd_set fds;
FD_ZERO(&fds); 
FD_SET(interp_output[0], &fds);
FD_SET(STDIN_FILENO, &fds);
struct timeval timeout, tvcopy; timeout.tv_sec=1;
int maxfd=interp_output[0]+1; //always >stdin+1
fd_set read_fds;
FD_COPY(&fds, &read_fds);

In a loop:

if (select(maxfd, &read_fds, NULL, NULL, &timeout)==-1) {perror("couldn't select"); return;}
if (FD_ISSET(interp_output[0], &read_fds)) {
    handle_interp_out();
} else if (FD_ISSET(STDIN_FILENO, &read_fds)) {
//waddstr(cmdwin, "stdin!"); wrefresh(cmdwin);
    handle_input();
}

FDCOPY(&fds, &read_fds);

handle_interp_out():

void handle_interp_out() {
    int ch;
    read(interp_output[0], &ch, 1);
    if (ch>0) {
            if (ch=='\n') { if (cmd_curline>=cmdheight) cmdscroll(); wmove(cmdwin, ++cmd_curline, 1); }
            else waddch(cmdwin, ch);
            wrefresh(cmdwin);
    }
}

EDIT 2: The write code is just a fprintf on a FILE* opened with fdopen(interp_output[1], "w") - this is in a different thread. All I'm getting to is my "prompt> " - it prints all that properly, but does one more iteration that it shouldn't. I've turned off buffering, which was giving me other problems.

EDIT 3: This has become a problem with my invocation of select(). It appears that, right away, it's returning -1 and errno's being set to 'invalid argument'. The read() doesn't know that and just keeps going. What could be wrong with my select()? I've updated the code and changed the title to more accurately reflect the problem...

EDIT 4: So now I'm thoroughly confused. The timeout value of .tv_sec=1 was no good, somehow. By getting rid of it, the code works just fine. If anybody has any theories, I'm all ears. I would just leave it at NULL, except this thread needs to periodically do updates.

Robert
  • 6,412
  • 3
  • 24
  • 26

4 Answers4

10

To absolutely guarantee that the read will never block, you must set O_NONBLOCK on the fd.

Your select error is almost certainly caused because you aren't setting the entire time struct. You're only setting the seconds. The other field will contain garbage data picked up from the stack.

Use struct initialization. That will guarantee the other fields are set to 0.

It would look like this:

struct timeval timeout = {1, 0};

Also, in your select loop you should be aware that Linux will write the time remaining into the timeout value. That means that it will not be 1 second the next time through the loop unless you reset the value to 1 second.

Zan Lynx
  • 53,022
  • 10
  • 79
  • 131
  • That fixed my timeval problem - I forgot about stack data. But it was really my original problem all along, since it was making select() return EINVAL which wasn't affecting the fd_set, which I was then checking. So thanks Alastair, checking the error is absolutely something I should check - but this was my original problem. – Robert Dec 06 '10 at 18:29
  • The first sentence of your answer is true, but only because of buggy implementations (Linux) which refuse to fix their bugs. According to POSIX, `select` is not allowed to return that a socket fd is readable unless there is data to `read`. – R.. GitHub STOP HELPING ICE Dec 07 '10 at 00:50
  • @R, See my answer. On error the return contents of and fd_sets and the timeout structure are undefined. He was getting an error, so the FD is allowed to be anything - it's undefined. – AlastairG Dec 07 '10 at 08:59
  • @Robert, your initialisation above is mixing executable code with declarations. I'm amazed it even compiles. A C++ compiler will allow that, but a proper C compiler will insist that within a block, all declarations must come before any executable code. Even in a C++ compiler it's a good idea to put all declarations in a set together before the executable code. It just makes it easier to read. – AlastairG Dec 07 '10 at 09:04
  • @R: Linux refuses to fix it because it is impossible to fix situations where the environment changes between the select and the read. – Zan Lynx Dec 07 '10 at 14:48
  • by 5 cents, if you use time.tv_usec greater than 1 sec (1M)re – Nick Nov 20 '14 at 07:55
3

According to the manpage:

On error, -1 is returned, and errno is set appropriately; the sets and timeout become undefined, so do not rely on their contents after an error.

You are not checking the return code from select().

The most likely explanation is that select() is being interrupted (errno=EINTR) and so returning an error, and the FD bit is still set in the "read" fd_set giving you the behaviour you are seeing.

Incidentally it is a very bad idea to name variables after standard/system/common functions. "read_fds" would be a MUCH better name than "read".

AlastairG
  • 4,119
  • 5
  • 26
  • 41
  • It is returning an error, but EINVAL - invalid argument. I also renamed the fd_set to read_fds (how could I have missed that?) but it didn't help. – Robert Dec 06 '10 at 17:27
2

It's correct. See the select() manpage in Linux for example: http://linux.die.net/man/2/select

"Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks"

The only solution is to use NON-BLOCKING socket.

ama
  • 41
  • 2
  • I'm not using sockets - just a regular pipe. That's something I didn't know, but I don't think it's causing me problems in this particular instance. – Robert Dec 06 '10 at 08:56
  • Opps. I missed the pipe thing... Sorry – ama Dec 06 '10 at 09:02
0

The answer is "it wouldn't". The behaviour you describe should never happen. Something else is going wrong.

I have been in a similar situation many times, and usually it turned out that there was a silly typo or cut and paste error elsewhere that led to behaviour that I mis-diagnosed.

If you post your code maybe we can help better - post the write code as well as the read please.

Also consider using asynchronous IO, even if only for debug purposes. If what you suspect is happening really is happening, then read() will return EWOULDBLOCK.

Also, you say you are copying the fd_set. How? Can you post your code for that?

AlastairG
  • 4,119
  • 5
  • 26
  • 41
  • See, I didn't think I was crazy. Posted the code above - the body of the if(ch>0) just puts the character on the screen; that works. The fd_sets are just local variables - assigning them *should* copy. – Robert Dec 06 '10 at 08:51
  • You can tell it's Monday morning. I mis-read your code and posted a stupid answer. Give me a sec. Anyway, you should be checking the return value from select(). It's possible that if select() fails (maybe due to an interrupt) it won't have modified the fd_set yet. – AlastairG Dec 06 '10 at 09:35