
I wrote a program that calls fork, and I expected that the child would not affect the parent.

But the file position in the parent changed, even though I made no changes in the parent.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    FILE *fp = fopen("sm.c", "r");
    char buf[1000];
    char *args[] = {"invailid_command", NULL};

    fgets(buf, sizeof(buf), fp);
    printf("I'm one %d %ld\n", getpid(), ftell(fp));
    if (fork() == 0) {
        execvp(args[0], args);
        exit(EXIT_FAILURE);
    }
    wait(NULL);
    printf("I'm two %d %ld\n", getpid(), ftell(fp));
} 

This outputs

I'm one 21500 20
I'm two 21500 -1

I want the file position to stay the same between the two printf calls.

Why does the file position change, and can I keep it from changing even when execvp fails?

Gilles 'SO- stop being evil'
fnclovers
  • Hm. Are you actually trying to execute "`invalid_command`"? – Eugene Sh. Jun 18 '18 at 16:53
  • Yes. The command is not available, so it fails with errno "No such file or directory" – fnclovers Jun 18 '18 at 16:56
  • You should have a look at errno, since a -1 return from ftell means an error. – PhilMasteG Jun 18 '18 at 16:57
  • There is no such an error for `ftell`... https://linux.die.net/man/3/ftell – Eugene Sh. Jun 18 '18 at 16:57
  • You can test with this! thanks for your opinion – fnclovers Jun 18 '18 at 17:01
  • What if you put a legal command there? – Eugene Sh. Jun 18 '18 at 17:06
  • I'm one 23236 20 a ishrc ishrc_parse.c out sampleish sm.c ish ishrc.c ishrc_parse.h out_err sm I'm two 23236 20 – fnclovers Jun 18 '18 at 17:08
  • You can also try removing the `execvp()` call entirely. – Andrew Henle Jun 18 '18 at 17:09
  • When I put {"ls", NULL} instead – fnclovers Jun 18 '18 at 17:09
  • Something fishy. Can't test it, but as far as I can tell, it should not behave this way. What `perror` would print in the end? – Eugene Sh. Jun 18 '18 at 17:12
  • It prints `invailid_command: No such file or directory` when I call `perror(args[0]);` before `exit(EXIT_FAILURE);` – fnclovers Jun 18 '18 at 17:16
  • No, not before, but after the failing `ftell`. – Eugene Sh. Jun 18 '18 at 17:17
  • Run your process under [`strace`](http://man7.org/linux/man-pages/man1/strace.1.html): `strace -f -o /path/to/output/file ...` Post the entire output as code into the question. It almost looks like your call to `fork()` is actually using `vfork()`, where both processes share the same address space after the call and before any subsequent `exec*()` function call in the child. What exact version of Linux are you running on? – Andrew Henle Jun 18 '18 at 17:21
  • It prints "Invalid argument" after the last printf. I think the failed execvp unexpectedly closes all streams. – fnclovers Jun 18 '18 at 17:22
  • Ubuntu 17.10 with gcc compiler. I will do it! thanks – fnclovers Jun 18 '18 at 17:24
  • I think there's a decent chance you're running into the issue discussed/diagnosed in [Why does forking my process cause the file to be read indefinitely?](https://stackoverflow.com/questions/50110992/), or a variant of that. See also the linked question (from that other question; the link is to [Uwanted child processes being created while file reading](https://stackoverflow.com/questions/50244579)). – Jonathan Leffler Jun 18 '18 at 17:29
  • Your program does not reproduce the issue you describe for me (CentOS 7, GCC 4.8.5). – John Bollinger Jun 18 '18 at 17:49
  • Program does not reproduce the issue on [onlinegdb](https://onlinegdb.com/rJNlGuHWX) (only changed `"sm.c"` to `"/bin/bash"`). – KamilCuk Jun 18 '18 at 17:55
  • Try using `_exit` in the child instead of `exit`. That fixed it for me (Ubuntu 16.04, gcc 5.4.0). – dbush Jun 18 '18 at 18:04

2 Answers


Credit to Jonathan Leffler for pointing us in the right direction.

Although your program does not produce the same unexpected behavior for me on CentOS 7 / GCC 4.8.5 / GLIBC 2.17, it is plausible that you observe different behavior. Your program's behavior is in fact undefined according to POSIX (on which you rely for fork). Here are some excerpts from the relevant section (emphasis added):

An open file description may be accessed through a file descriptor, which is created using functions such as open() or pipe(), or through a stream, which is created using functions such as fopen() or popen(). Either a file descriptor or a stream is called a "handle" on the open file description to which it refers; an open file description may have several handles.

[...]

The result of function calls involving any one handle (the "active handle") is defined elsewhere in this volume of POSIX.1-2017, but if two or more handles are used, and any one of them is a stream, the application shall ensure that their actions are coordinated as described below. If this is not done, the result is undefined.

[...]

For a handle to become the active handle, the application shall ensure that the actions below are performed between the last use of the handle (the current active handle) and the first use of the second handle (the future active handle). The second handle then becomes the active handle. [...]

The handles need not be in the same process for these rules to apply.

Note that after a fork(), two handles exist where one existed before. The application shall ensure that, if both handles can ever be accessed, they are both in a state where the other could become the active handle first. [Where subject to the preceding qualification, the] application shall prepare for a fork() exactly as if it were a change of active handle. (If the only action performed by one of the processes is one of the exec functions or _exit() (not exit()), the handle is never accessed in that process.)

For the first handle, the first applicable condition below applies. [An impressively long list of alternatives that do not apply to the OP's situation ...]

  • If the stream is open with a mode that allows reading and the underlying open file description refers to a device that is capable of seeking, the application shall either perform an fflush(), or the stream shall be closed.

For the second handle:

  • If any previous active handle has been used by a function that explicitly changed the file offset, except as required above for the first handle, the application shall perform an lseek() or fseek() (as appropriate to the type of handle) to an appropriate location.

Thus, for the OP's program to access the same stream in both parent and child, POSIX demands that the parent fflush() the stream (fp) before forking, and that the child fseek() it after starting. Then, after waiting for the child to terminate, the parent must fseek() the stream. Given that we know the child's exec will fail, however, the requirement for all the flushing and seeking can be avoided by having the child use _exit() (which does not access the stream) instead of exit().

Complying with POSIX's provisions yields the following:

When these rules are followed, regardless of the sequence of handles used, implementations shall ensure that an application, even one consisting of several processes, shall yield correct results: no data shall be lost or duplicated when writing, and all data shall be written in order, except as requested by seeks.

It is worth noting, however, that

It is implementation-defined whether, and under what conditions, all input is seen exactly once.


I appreciate that it may be somewhat unsatisfying to hear merely that your expectations for program behavior are not justified by the relevant standards, but that's really all there is. The parent and child processes do have some relevant shared data in the form of a common open file description (with which they have separate handles associated), and that seems likely to be the vehicle for the unexpected (and undefined) behavior, but there's no basis for predicting the specific behavior you see, nor the different behavior I see for the same program.

John Bollinger

I was able to reproduce this on Ubuntu 16.04 with gcc 5.4.0. The culprit here is exit in conjunction with the way the child process is being created.

The man page for exit states the following:

The exit() function causes normal process termination and the value of status & 0377 is returned to the parent (see wait(2)).

All functions registered with atexit(3) and on_exit(3) are called, in the reverse order of their registration. (It is possible for one of these functions to use atexit(3) or on_exit(3) to register an additional function to be executed during exit processing; the new registration is added to the front of the list of functions that remain to be called.) If one of these functions does not return (e.g., it calls _exit(2), or kills itself with a signal), then none of the remaining functions is called, and further exit processing (in particular, flushing of stdio(3) streams) is abandoned. If a function has been registered multiple times using atexit(3) or on_exit(3), then it is called as many times as it was registered.

All open stdio(3) streams are flushed and closed. Files created by tmpfile(3) are removed.

The C standard specifies two constants, EXIT_SUCCESS and EXIT_FAILURE, that may be passed to exit() to indicate successful or unsuccessful termination, respectively.

So when you call exit in the child it closes the FILE represented by fp.

Normally when a child process is created, it gets a copy of the parent's file descriptors. However, in this case it seems the child's memory still physically points to the parent's. So when exit closes the FILE it is affecting the parent.

If you change the child to instead call _exit, it closes the child's file descriptor but manages to not touch the FILE object and the second call to ftell in the parent will succeed. It's good practice to use _exit in a non-exec'ed child anyway because it prevents atexit handlers from being called in the child.

dbush
  • I'm not buying it. No action by one process can directly affect the memory of another, no matter the relationship between the two (or if it does then the implementation is non-conforming). But @JonathanLeffler did seem to be onto something that explains why changing from `exit()` to `_exit()` could make a difference. – John Bollinger Jun 18 '18 at 18:36
  • *However, in this case it seems the child's memory still physically points to the parent's. So when exit closes the FILE it is affecting the parent.* I'd think it's the calls such as `lseek()` on the underlying file descriptor in the child process as part of flushing buffers on `exit()` that are affecting the offset of the parent's file descriptor. That's certainly a much less consequential bug than cross-process memory pollution. I haven't read all the data @JonathanLeffler linked, but his theory seems a lot more likely to me. – Andrew Henle Jun 18 '18 at 18:55
  • I had to call fclose before exit, and that makes my program run properly! Thanks for your answer. – fnclovers Jun 18 '18 at 22:43