
I have a strange problem when using pgrep to find which shell processes are running the same script as the current one.

Here is the test script, named test.sh:

#!/bin/bash

full_res=`pgrep -a -l -f 'test\.sh'`
res=$(pgrep -a -l -f 'test\.sh' | cat)

echo "short result is $full_res"
echo "weird result is $res"

Running it gives this output:

sh test.sh &
[1] 19992
➜  logs short result is 19992 sh test.sh
weird result is 19992 sh test.sh
19996 sh test.sh

[1]  + 19992 done       sh test.sh

I don't know where the line 19996 sh test.sh comes from; it only appears when the output is piped to cat. I suspect it might be a bug in the pgrep implementation.

Looking forward to a reasonable explanation.

Thx,

Balin

machinarium

2 Answers

When you run the pipeline inside backticks or $(...), a subshell is created, which is a forked copy of the original bash process that is running your script.

At the point where pgrep runs, the process tree actually looks like this:

bash test.sh
  └─bash test.sh
      ├─ pgrep -f test.sh
      └─ cat

So pgrep is doing what you asked it to.
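
As a rough check of the tree above, the following sketch compares the PID of the script's shell with the PID of the subshell created for a piped command substitution. The variable names main_pid and sub_pid and the helper command sh -c 'echo "$PPID"' are illustrative choices, not part of the original script. The subshell is a fork of the script process, so it keeps the same command line, which is what pgrep -f matches against.

```shell
#!/bin/bash
# Sketch: show that a piped command substitution runs in a separate process.
# main_pid and sub_pid are hypothetical names used only for this illustration.

main_pid=$$

# The helper `sh -c 'echo "$PPID"'` prints the PID of its parent process.
# Inside $( ... | cat ) that parent is the subshell forked for the
# command substitution, not the original script process.
sub_pid=$(sh -c 'echo "$PPID"' | cat)

echo "script shell PID: $main_pid"
echo "subshell PID:     $sub_pid"
```

The two numbers should differ, and the second one corresponds to the extra line that pgrep reported in the question.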

You can simulate this behaviour like this.

#!/bin/bash
echo mypid $$
$(sleep 60 | sleep 60 | sleep 60)

Run the script in the background, then use the PID it printed to inspect the process tree with pstree.

$ ./test.bash 
mypid 335153
^Z
[1]+  Stopped                 ./test.bash
$ bg
[1]+ ./test.bash &
$ pstree -p 335153
test.bash(335153)───test.bash(335154)─┬─sleep(335155)
                                      ├─sleep(335156)
                                      └─sleep(335157)
Matthew Ife
  • I would think that the parent bash has a bash child that spawns pgrep and another, separate bash child that spawns cat. I think that when pgrep runs, the second bash child has not yet been created. – glenn jackman Apr 13 '23 at 19:02
  • @glenn jackman that's not correct. But to illustrate, I updated my answer. – Matthew Ife Apr 13 '23 at 19:10
  • The parts of a pipeline run in subshells, yes... but that should mean _two_ extra Bash processes for a two-part pipe. But there's only one, even in that `sleep | sleep | sleep` case. That's because Bash optimizes subshells that run only one command by exec'ing the launched command over the shell process, so we don't see those. I _think_ the Bash process actually seen there is due to the subshell involved in the _command substitution_. – ilkkachu Apr 14 '23 at 07:14
  • We don't see it in the original `res=$(pgrep)` case, again because of the optimization with exec. But it appears also with `res=$(pgrep; true)` since the second command there foils the optimization. So it's not just the pipe. You could try and see what the pstree looks like if you change that test case to `$(sleep 60; true)`? – ilkkachu Apr 14 '23 at 07:16
  • What the `sleep` example shows is that command substitution starts an extra shell, and that once a process has exec'd, its title changes to the command being run (thus not four `bash` processes). For the original question it means that `pgrep` finds the forked shell (or the one being used for the command substitution) before the child process did `exec`, I guess. – U. Windl Apr 14 '23 at 09:44
  • I should probably note that this is precisely due to the backticks that the subshell is created. I edited the answer to make that clearer. – Matthew Ife Apr 14 '23 at 10:32
  • `exec`, of course. Thanks for the thorough answer. – glenn jackman Apr 14 '23 at 12:22

From Pipelines in the bash manual:

Each command in a multi-command pipeline, where pipes are created, is executed in its own subshell, which is a separate process

Tangentially, this is why this won't work:

date | read theDate
echo "$theDate"

because the read command runs in a subshell, so the theDate variable is populated in the subshell, not in the current shell.
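
A minimal sketch of the usual way around this, assuming bash with default settings: use command substitution so the assignment happens in the current shell. A fixed string printed by printf stands in for date so the result is predictable, and the name pipedDate is a hypothetical variable added for the comparison.

```shell
#!/bin/bash
# Sketch: capture command output into a variable of the current shell.
# printf with a fixed string stands in for `date` (an assumption for
# illustration); pipedDate is a hypothetical variable name.

# Pipe form: in bash with default settings, `read` runs in a subshell,
# so the assignment is lost when the pipeline finishes.
printf '%s\n' '2023-04-13' | read -r pipedDate
echo "after pipe: '${pipedDate}'"

# Command substitution assigns in the current shell instead.
theDate=$(printf '%s\n' '2023-04-13')
echo "after substitution: '${theDate}'"
```

Some other shells run the last part of a pipeline in the current shell, so the first echo may behave differently there; the command substitution form does not depend on that.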

glenn jackman
  • Just to follow on from this: `pgrep -f` is asked to match `test.sh` against the full command line. When the subshell is created, its command line is a copy of that of the original spawning process. So `$(this|that)` results in another process called `test.sh` that is a child of the original `test.sh` process. Hence pgrep matches both the original script process and its subshell. – Matthew Ife Apr 13 '23 at 18:54
  • This "answer" is correct by itself, but doesn't contribute to answering the question IMHO. – U. Windl Apr 14 '23 at 09:39