-1

This is a sample data:

+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+

I want to print only Plan column only, but as you see, it's not a fixed column number. For example for 1.1.1.1, the plan name is in the from 11 to the end (let's suppose from the end, but we can remove the | at the end of line).

First, it should filter only the Plan column (the format is like this and is table-like), and we can exclude the first three lines of header and the one last line of the tail so that we can only have the plan names.

Expected output is only plan names:

PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90

I'm googling but didn't find a way up to now.

Saeed
  • 3,255
  • 4
  • 17
  • 36
  • 1
    You can change the default field separator to what suits your needs: `awk -F'[|]' ...`. Or, to also include the spaces around `|`: `awk -F'[[:space:]]*[|][[:space:]]* ...`. Note: as `|` is a regex operator we cannot just `awk -F'|'`, reason why we use `[|]`. We could also `awk -F'\\|'` (yes, double backslash). – Renaud Pacalet Jul 31 '23 at 10:31
  • @RenaudPacalet thanks, yes that's it. I think I was writing wrong because I thought of it but didn't work:( – Saeed Jul 31 '23 at 10:34
  • Aside from the suggestion by Ranaud Pacalet: If your columns are always the same fixed width, you could also simply extract the plan names by character position. – user1934428 Jul 31 '23 at 12:13
  • 2
    You don't need to google as I showed you exactly how to do that in [my answer to your previous question](https://stackoverflow.com/a/76797638/1745001) - note the `-F'[[:space:]]*[|][[:space:]]*'` being used to read the same type of `|`-separated data as you show in this question. Please read (and try) the answers you get and ask questions under the answer if there's something in it you don't understand about how it works and/or unexpected output. – Ed Morton Jul 31 '23 at 12:20

5 Answers5

4

You can use:

$ awk -F '|' 'NR>4 {gsub(/^[[:space:]]+|[[:space:]]+$/,"",v); print v} {v=$(NF-1)}' file
PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90
  • Set field separator to | (-F '|')
  • Skip the first three lines and the last one. See here for how this works
  • Create a variable v which holds the second to last entry (v=$(NF-1)), then remove all leading and trailing whitespace from it (gsub(/^[[:space:]]+|[[:space:]]+$/,"",v))
Paolo
  • 21,270
  • 6
  • 38
  • 69
1

If you're interested in a alternative

sed '
  1,3d                             # delete the first 3 lines
  $d                               # delete the last line
  s/[[:blank:]]*|[[:blank:]]*$//   # remove the trailing border
  s/.*|[[:blank:]]*//              # consume all the leading cells.
' file

With sed's basic regular expressions, | is a plain character.

glenn jackman
  • 238,783
  • 38
  • 220
  • 352
1

using sedsimply:

  • 1st remove last space + pipe: |
  • Then remove all untils pipe + space: .*|
  • Oper from line 3 to end

So command will be just:

sed -ne '3,${s/ |[^|]*$//;s/^.*| //p;}'

Or

sed -ne '
    3,${
        s/ |[^|]*$//;
        s/^.*| //p;
    }
'

But if the field have to begin by plan, things become simplier:

sed -ne 's/.*\(Plan.*[^[:space:]]\)[[:space:]]*|[^|]*$/\1/p'
techno
  • 525
  • 3
  • 13
1

Here is a Ruby that allows selecting columns by name of the column:

ruby  -e '
h=Hash.new()
desired="Plan"
$<.read.split(/^\+[-+]+\+$\R/).
    select{|l| l[/\S+/]}.
    map{|l| l.split(/\n/)}.flatten.
    map{|sl| sl.split(/\s*\|\s*/)[1..]}.transpose.
    each{|a| h[a[0]]=a[1..]}
puts h[desired].join("\n")
' file

Or, this awk:

awk -v d="Plan" '
BEGIN{FS="[[:blank:]]*[|][[:blank:]]*"; idx=0}
FNR==2{for(i=1; i<=NF; i++) if (d==$i) idx=i; next}
NR>1 {print $idx}
' file

Either prints:

PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90

You can print any other column by changing desired="Plan" to the desired name at the top of the column.

dawg
  • 98,345
  • 23
  • 131
  • 206
1

I think it's very easy, check this out :

awk -F "|" 'NR>3 && NR<8 {print $7}' file.txt

or another nice solution with the combination of sed and datamash

sed '1,3d;$d' file.txt | datamash -t '|' cut 7

in sed

  • '1,3d;$d': to delete lines 1 to 3 (1,3d) and the last line ($d) of the file.

in datamash

  • -t "|": Set the delimiter as "|" (pipe symbol).
  • cut 7: Select and display only the 7th column (Plan column).

output

PlanA BGP40 
PlanB BGP30 
PlanA BGP10 
Plan BGP90
Freeman
  • 9,464
  • 7
  • 35
  • 58