2

After reopening STDOUT stream, the message does not display on my screen if calling print() like this:

printf("The message disappeared\n")

The snippet code for explaining the problem:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <stdarg.h>
#include <unistd.h>
#include <errno.h>

int main(void)
{
    printf("Display a message\n");
    int fd, fd_copy, new_fd;
    FILE *old_stream = stdout;

    fd = STDOUT_FILENO;

    fd_copy = dup(fd);
    fclose(old_stream);
    new_fd = dup2(fd_copy, fd);
    close(fd_copy);
    FILE *new_stream = fdopen(fd, "w");
    stdout = new_stream;

    printf("test %d\n", 1);
    fprintf(stdout, "test 2\n");
    int rc = printf("test 3\n");
    printf("Test 4 Why the message disappeared\n");
    printf("Error message is [%s]\n", strerror(errno));

    return 0;
}

Why only the test 4 cannot display on my screen. Don't they all use stdout as output?

Output:

# gcc main.c; ./a.out
Display a message
test 1
test 2
test 3      
Error message is [Bad file descriptor]

The code snippet above comes from the LVM2 library function.

int reopen_standard_stream(FILE **stream, const char *mode)
/* https://github.com/lvmteam/lvm2/blob/master/lib/log/log.c */

The dynamic library I designed:

I wrap up a dynamic library it includes the LVM dynamic library for other processes uses. One of the functions is this (output all PVs in the system):

char global_buffer[0x1000];
void show_pvs_clbk_fn(int level, const char *file, int line,
    int dm_errno, const char *format)
{
    /* Extract and process output here rather than printing it */
    if (level != LVM2_LOG_PRINT)
        return;
    sprintf(global_buffer, "%s%s\n", global_buffer, format)
}

int show_all_PV(char *out_buffer)
{
    void *handle = lvm2_init();
    lvm2_log_fn(show_pvs_clbk_fn);
    int rc = lvm2_run(handle, "pvs");
    lvm2_exit(handle);
    if (rc != LVM2_COMMAND_SUCCEEDED) {
        return -1;
    }

    strcpy(out_buffer, global_buffer)
    return 0;
}

A caller may calls show_all_PV() API like this:

 int main(void)
 {
    char tmp[0x1000];
    if (!show_all_PV(tmp)) {
        printf("====== PVS are ======\n");
        printf("%s\n", tmp);
    }
 }

Output:

====== PVS are ======
 PV             VG   Fmt  Attr PSize  PFree
 /dev/nvme1n1p1 vg1  lvm2 a--  <1.2t   1.1t

Some caller maybe mess up the stdout:

I found a stranger thing is that if the caller defines a function which includes vfprintf(stdout, ) system call. they never get output from normal print() API.

#inclide <stdlin.h>
#inclide <stdio.h>
#inclide <unistd.h>
#inclide <stdarg.h>

#if 1
int a_function_never_be_called(const char *formatP, ...)
{
    va_list  ap;
    va_start(ap, formatP);
    vfprintf(stdout, formatP, ap);
    va_end(ap);
    return 0;
}
#endif

int main(void)
     {
        char tmp[0x1000];
        if (!show_all_PV(tmp)) {
            printf("====== PVS are ======\n");
            printf("%s\n", tmp);
        }
     }

The string "====== PVS are ======" disappeared and the caller got an IO error Bad file descripto.

Output:

 PV             VG   Fmt  Attr PSize  PFree
 /dev/nvme1n1p1 vg1  lvm2 a--  <1.2t   1.1t
Sean
  • 25
  • 3
  • 7
    The assignment `stdout = new_stream;` is dubious at best — undefined behaviour. – Jonathan Leffler Jan 27 '22 at 14:01
  • 1
    Well. The code snippet shown is just plain wrong; you may have found a bug in the LVM2 library. But the problem is it's _so_ wrong that I don't understand what it's _supposed_ to do, and I also don't understand what you expect it to do (which may be two different things). Could you please explain, in excruciating detail, what you think your test program _ought_ to do? What effect do you think that messing around with stdout _should_ have? – zwol Jan 27 '22 at 14:02
  • (Explain by editing your question, please. Did you know you can edit your question? The tiny gray word "edit" under the tags is a button. Yes, it's terrible UI design. Stack Exchange Inc. doesn't care.) – zwol Jan 27 '22 at 14:03
  • Turning the code fragments into [mre]s would also be helpful. I suspect that you are not telling us about some important `#include`s and/or `using`s. – Yunnosch Jan 27 '22 at 14:38
  • Thanks, guys, now I know assignment stdout is undefined behaviour. – Sean Jan 27 '22 at 16:30
  • 1
    I still don't properly understand what the LVM library was trying to accomplish but I filed https://github.com/lvmteam/lvm2/issues/64 . – zwol Jan 27 '22 at 17:10

1 Answers1

6

Assigning to stdout (or stdin or stderr) is Undefined Behaviour. And in the face of undefined behaviour, odd things happen.

Technically, no more needs to be said. But after I wrote this answer, @zwol noted in a comment that the glibc documentation claims to allow reassignment of standard IO streams. In those terms, this behaviour is a bug. I accept this fact, but the OP was not predicated on the use of glibc, and there are many other standard library implementations which don't make this guarantee. In some of them, assigning to stdout will raise an error at compile time; in others, it will simply not work or not work consistently. In other words, regardless of glibc, assigning to stdout is Undefined Behaviour, and software which attempts to do so is, at best, unportable. (And, as we see, even on glibc it can lead to unpredictable output.)

But my curiosity was aroused so I investigated a bit. The first thing is to look at the actual code generated by gcc and see what library function is actually being called by each of those output calls:

    printf("test %d\n", 1);         /* Calls printf("test %d\n", 1); */
    fprintf(stdout, "test 2\n");    /* Calls fwrite("test 2\n", 1, 7, stdout); */
    int rc = printf("test 3\n");    /* Calls printf("test 3\n"); */
    printf("Test 4 Why the message disappeared\n");
                                    /* Calls puts("Test 4...disappeared"); */
    printf("Error message is [%s]\n", strerror(errno));
                                    /* Calls printf("..."); */

Note that GCC is trying hard to optimise the calls. In lines 2 and 4, it is able to find a non-printf library call, avoiding run-time parsing of the format string.

But note that it does not do that in the case of line 3, which looks the same as line 4. Why not? Because you are using the return value of printf, which is the number of characters sent to stdout. But that's not the same as the return value of puts, which just returns a "non-negative number" on success. So the substitution is impossible.

Suppose we remove int rc = from line 3, and recompile. Now we get this:

    printf("test %d\n", 1);         /* Calls printf("test %d\n", 1); */
    fprintf(stdout, "test 2\n");    /* Calls fwrite("test 2\n", 1, 7, stdout); */
    printf("test 3\n");             /* Calls puts("test 3"); */
    printf("Test 4 Why the message disappeared\n");
                                    /* Calls puts("Test 4...disappeared"); */
    printf("Error message is [%s]\n", strerror(errno));
                                    /* Calls printf("..."); */

So without the use of the return value, GCC can substitute printf with puts. (Note also that when it does that substitution, it also removes the \n from the string literal, because puts automatically adds a newline to the end of its output.)

When we run the modified program, we see this:

Display a message
test 1
test 2
Error message is [Bad file descriptor]

Now, two lines have disappeared, which are precisely the two lines for which GCC used puts.

After the shenanigans at the beginning, puts no longer works, presumably because it relies on stdout not having been reassigned. Which it's allowed to do, because reassigning stdout is Undefined Behaviour. (You can use freopen if you want to reopen stdout.)


Final note:

Unsurprisingly, it turns out that the glibc team did accept it as a bug; it was reported as bug 24051 and a similar issue with stdin as bug 24153. Both were fixed in glibc v2.30, released in August of 2019. So if you have a recently upgraded Linux install, or you are reading this answer years after I wrote it, you might not see this bug.

rici
  • 234,347
  • 28
  • 237
  • 341
  • 1
    Hold on a sec, reassigning `stdout` is a documented implementation extension in GNU libc, see https://www.gnu.org/software/libc/manual/html_node/Standard-Streams.html It might have gotten broken by accident, but it's weird that the breakage would affect `puts` and not `printf`. – zwol Jan 27 '22 at 16:24
  • 1
    @zwol: I don't pretend to understand all the games glibc plays with global names, and it may well be a bug. At the end of the day, there are two FILE pointers, `_IO_stdout` and `stdout`; initially these both point to the same struct FILE (`_IO_2_1_stdout_`), but they are still two different variables. I was confused by `libio/iolibio.h`, which seems to be trying to alias the two names, but they are certainly different. `puts` uses `_IO_stdout`. But `printf` immediately invokes `vfprintf(stdout, fmt, ap)`, so it uses `stdout`. Thus, reassigning `stdout` works with printf but not with puts. – rici Jan 27 '22 at 16:43
  • Yeah, I think that's a bug. All the `_IO_` stuff is legacy gunk that nobody really understands anymore and I'm not surprised it's broken. I can't reproduce the bug with glibc 2.33, though; which version do you have? – zwol Jan 27 '22 at 16:53
  • 1
    @zwol: the source code I was looking at is 2.28, which is the most recent I have kicking around. I see that it's changed quite a bit since then, but I didn't do a search in the bugs. But regardless of glibc, I think assigning to stdout is UB, and it will probably have weird consequences with many standard library implementations. – rici Jan 27 '22 at 16:59
  • @zwol: I found the bugzillas. Links added at end of answer. – rici Jan 27 '22 at 21:28