113

I want to do this:

  1. run a command
  2. capture the output
  3. select a line
  4. select a column of that line

Just as an example, let's say I want to get the command name from a $PID (please note this is just an example, I'm not suggesting this is the easiest way to get a command name from a process id - my real problem is with another command whose output format I can't control).

If I run ps I get:


  PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps

Now I do ps | egrep 11383 and get

11383 pts/1    00:00:00 bash

Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:

<absolutely nothing/>

The problem is that cut splits the output on single spaces, and since ps pads the gap between the 2nd and 3rd columns with extra spaces to keep some semblance of a table, cut picks an empty string. Of course, I could use cut to select the 7th and not the 4th field, but how can I know which field to pick, especially when the output is variable and unknown beforehand?
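
To see the effect in isolation, here is a small sketch that feeds the sample line from above through cut. The variable names (line, field4, field7) are only for illustration.

```shell
# Sample ps line copied from the question (four spaces after pts/1).
line='11383 pts/1    00:00:00 bash'

# Every single space is a delimiter, so the run of four spaces
# produces three empty fields (3, 4 and 5).
field4=$(printf '%s\n' "$line" | cut -d' ' -f4)
field7=$(printf '%s\n' "$line" | cut -d' ' -f7)

echo "field 4: [$field4]"
echo "field 7: [$field7]"
```

Field 4 comes back empty, while the command name only shows up at field 7, a position that depends entirely on how much padding ps happened to emit.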

jww
flybywire

10 Answers

199

One easy way is to add a pass of tr to squeeze any repeated field separators out:

$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4
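
As a reproducible sketch of the same idea, the following uses the sample line from the question instead of live ps output (the variable names are made up for the example):

```shell
# Sample ps line from the question, with its run of padding spaces.
line='11383 pts/1    00:00:00 bash'

# tr -s ' ' collapses each run of spaces into a single space,
# so cut now sees exactly four fields.
cmd=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f4)

echo "$cmd"
```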
unwind
80

I think the simplest way is to use awk. Example:

$ echo "11383 pts/1    00:00:00 bash" | awk '{ print $4; }'
bash
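
Selecting the line and the column can also happen in one awk call. The sketch below assumes the ps output shown in the question, stored in a variable so the result is reproducible; pid is a stand-in for the real $PID.

```shell
# Canned ps output from the question.
ps_output='  PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps'

pid=11383   # hypothetical value; in practice this would be "$PID"

# -v hands the shell value to awk. $1 == pid compares the first field,
# and awk's default field splitting ignores runs of blanks.
cmd=$(printf '%s\n' "$ps_output" | awk -v pid="$pid" '$1 == pid { print $4 }')

echo "$cmd"
```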
brianegge
  • 4
    For compatibility with the original question, `ps | awk "\$1==$PID{print\$4}"` or (better) `ps | awk -v"PID=$PID" '$1==PID{print$4}'`. Of course, on Linux you could simply do `xargs -0n1 – ephemient Oct 27 '09 at 16:09
  • Is the `;` in `{ print $4; }` required? Removing it seems to have no effect for me on Linux, just curious as to its purpose – igniteflow Aug 08 '16 at 13:20
  • @igniteflow wouldn't it indicate the end of the command if you wanted to continue adding on past the print statement? – joshmcode Feb 26 '19 at 23:09
19

Please note that the tr -s ' ' option will not remove any single leading spaces. If your column is right-aligned (as with ps pid)...

$ ps h -o pid,user -C ssh,sshd | tr -s " "
 1543 root
19645 root
19731 root

Then cut will return an empty string for the first field of any line that starts with a space:

$ <previous command> | cut -d ' ' -f1

19645
19731

Unless you prefix every line with a space first, so that all lines start the same way and the pid is always field 2:

$ <command> | sed -e "s/.*/ &/" | tr -s " "
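
A small sketch of the difference, using two sample lines in the shape of the output above rather than live ps data (naive and fixed are illustrative names):

```shell
# Right-aligned sample: only the first line starts with a space.
sample=' 1543 root
19645 root'

# Without the fix, field 1 of the first line is empty.
naive=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d' ' -f1)

# Prefix every line with a space, squeeze, then take field 2.
fixed=$(printf '%s\n' "$sample" | sed -e 's/.*/ &/' | tr -s ' ' | cut -d' ' -f2)

printf '%s\n' "$fixed"
```

After the prefix step every line begins with exactly one space, so field 1 is always empty and the pid is consistently in field 2.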

Now, for this particular case of pid numbers (not names), there is a command called pgrep:

$ pgrep ssh


Shell functions

However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:

$ <command> | while read a b; do echo $a; done

The first parameter to read, a, selects the first column, and if there is more, everything else will be put in b. As a result, you never need more variables than the number of your column +1.

So,

while read a b c d; do echo $c; done

will then output the 3rd column. As indicated in my comment...

A piped read is executed in a subshell (at least in Bash and Dash), so the variables it sets are not passed back to the calling script. You can still capture what the block prints:

out=$(ps whatever | { read a b c d; echo $c; })

arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]}     # will output the value of b
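
As a sketch of two ways around the subshell, here is a version that works on a fixed sample line. It assumes a POSIX shell, and the variable names are invented for the example.

```shell
line='11383 pts/1    00:00:00 bash'

# Option 1: capture what the brace group prints.
third=$(printf '%s\n' "$line" | { read -r a b c d; echo "$c"; })

# Option 2: feed read from a here-document instead of a pipe. read then
# runs in the current shell, so the variables remain set afterwards.
read -r pid tty elapsed cmd <<EOF
$line
EOF

echo "$third $cmd"
```

The here-document variant avoids both the subshell and the extra command substitution, at the cost of needing the text in a variable first.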


The Array Solution

So we then end up with the answer by @frayser, which relies on the shell variable IFS (by default space, tab and newline) to split the string into an array. It only works in Bash, though: Dash and Ash do not support arrays.

I have had a really hard time splitting a string into components on a Busybox system. It is easy enough to get a single component (e.g. using awk) and then to repeat that for every parameter you need. But then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line, which is neither efficient nor pretty. So you end up splitting using ${name%% *} and so on.

It makes you yearn for some Python skills, because shell scripting is not a lot of fun anymore when half or more of the features you are accustomed to are gone. But you can assume that even Python would not be installed on such a system, and it wasn't ;-).
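
For shells without arrays, the ${name%% *} style of splitting mentioned above can be sketched roughly like this. It assumes the default IFS and a fixed sample line; the variable names are illustrative.

```shell
line='11383 pts/1    00:00:00 bash'

# Parameter expansion: drop everything from the first space onwards...
first=${line%% *}
# ...or everything up to and including the last space.
last=${line##* }

# Word splitting into positional parameters handles all fields at once.
# set -f turns off globbing so a field such as '*' is not expanded.
set -f
set -- $line
set +f
third=$3

echo "$first $third $last ($# fields)"
```

The set -- form overwrites the script's own positional parameters, so it is best done inside a function or after the original arguments have been saved.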

Xennex81
  • You should use quotes around the variable in `echo "$a"` and `echo "$c"` though. – tripleee Feb 16 '15 at 11:42
  • It seems though as if every piped block is executed in its own subshell or process and you can't return any variables to the enclosing block? Though you can obtain the output of that after echoing it. `var=$(....... | { read a b c d; echo $c; })`. That only works for a single (string), though in Bash you can split it into an array using `ar=($var)` – Xennex81 Mar 04 '15 at 14:24
  • @tripleee I don't think that is an issue at such a stage of the process. You'll discover soon enough whether you need that or not, and if that breaks at some point, it is a learning lesson. And then you know **why** you've had to use those double quotes ;-). And then it is no longer something you've heard say from others. Play with fire! :D. :p. – Xennex81 Mar 04 '15 at 15:21
  • elaborated answer :D – ncomputers May 23 '17 at 22:19
  • This was too helpful an answer for me to not say so. – Ivan X Aug 11 '20 at 11:26
4

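Try this, which uses ksh co-process syntax (ps |& starts ps as a co-process and read -p reads from it):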

ps |&
while read -p first second third fourth etc ; do
   if [[ $first == '11383' ]]
   then
       echo got: $fourth
   fi       
done
James Anderson
4

Your command

ps | egrep 11383 | cut -d" " -f 4

is missing a tr -s to squeeze spaces, as unwind explains in his answer.

However, you may want to use awk, since it handles all of these actions in a single command:

ps | awk '/11383/ {print $4}'

This prints the 4th column of the lines containing 11383. If you want it to match 11383 only when it appears at the beginning of the line, you can say ps | awk '/^11383/ {print $4}'.
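
The difference between matching the text anywhere and comparing the first field can be sketched with two sample lines. The second line (pid 111383, command vim) is invented to show the pitfall.

```shell
# First line is from the question; the second is a made-up process
# whose pid merely contains the digits 11383.
sample='11383 pts/1    00:00:00 bash
111383 pts/2    00:00:00 vim'

# A bare regex matches the digits anywhere on the line.
by_regex=$(printf '%s\n' "$sample" | awk '/11383/ { print $4 }')

# Comparing the first field matches only the exact pid.
by_field=$(printf '%s\n' "$sample" | awk '$1 == 11383 { print $4 }')

printf 'regex:\n%s\nfield:\n%s\n' "$by_regex" "$by_field"
```

The regex version returns both commands, while the field comparison returns only bash.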

Community
fedorqui
3

Using array variables

set $(ps | egrep "^11383 "); echo $4

or

A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}
frayser
2

Similar to brianegge's awk solution, here is the Perl equivalent:

ps | egrep 11383 | perl -lane 'print $F[3]'

-a enables autosplit mode, which populates the @F array with the column data.
Use -F, (the -F option followed by a comma) if your data is comma-delimited rather than space-delimited.

Field 3 is printed since Perl starts counting from 0 rather than 1.

Chris Koknat
  • 1
    Thank you for your perl solution -- didn't know about autosplit, and *still* think perl is the tool to end the other tools.. ;). – Gerard ONeill Sep 23 '15 at 20:18
1

Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:

command|head -n 6|tail -n 1|awk '{print $4}'
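
As a possible variation, awk can pick the line itself through its NR (record number) variable. The sketch below generates seven numbered sample lines so that both forms can be compared; the data is made up for the example.

```shell
# Seven sample lines of the form "a1 b1 c1 d1" ... "a7 b7 c7 d7".
sample=$(awk 'BEGIN { for (i = 1; i <= 7; i++) print "a" i, "b" i, "c" i, "d" i }')

# The head/tail/awk pipeline from above.
with_head_tail=$(printf '%s\n' "$sample" | head -n 6 | tail -n 1 | awk '{ print $4 }')

# The same selection in a single awk call; exit stops reading after line 6.
with_awk_only=$(printf '%s\n' "$sample" | awk 'NR == 6 { print $4; exit }')

echo "$with_head_tail $with_awk_only"
```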
soulmerge
0

Instead of doing all these greps and stuff, I'd advise you to use ps's ability to change its output format.

ps -o cmd= -p 12345

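You get the command line of the process with the specified pid and nothing else.

The -o and -p options are specified by POSIX, so this approach may be considered portable. Note, however, that the format names POSIX defines are comm (command name) and args (full command line); cmd is an alias accepted by the procps ps found on Linux.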

P Shved
0

Bash's set will parse all output into positional parameters.

For instance, after running set $(free -h), echo $7 will show "Mem:".
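
A reproducible sketch of this, using a made-up two-line text in the shape of free -h output (the numbers are invented, not real measurements):

```shell
# Hypothetical output shaped like `free -h`; the values are invented.
output='total used free shared buff/cache available
Mem: 7.7G 2.1G 3.0G 300M 2.6G 5.0G'

# Word splitting ignores line boundaries, so words from every line
# are numbered in one sequence. set -f keeps globbing out of the way.
set -f
set -- $output
set +f

seventh=$7      # first word of the second line
tenth=${10}     # positions above 9 need braces

echo "$seventh $tenth"
```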

dman
  • This method is useful only when the command has a single line of output. Not generic enough. – codeforester May 29 '17 at 04:32
  • That is not true, all output is placed into positional parameters regardless of lines. ex `set $(sar -r 1 1)` ; `echo "${23}"` – dman May 29 '17 at 04:50
  • My point was that it is hard to determine the position of the argument when the output is voluminous and has many fields. `awk` is the best way to go about it. – codeforester May 29 '17 at 04:55
  • This is just another solution. The OP may not want to learn awk language for this single use case. The tags do state `bash` and not `awk`. – dman May 29 '17 at 05:03