
I have a task where I need to parse files and extract information. I can do this easily with a bash script, but I have to get it done using Unix commands only.

For example, I have a file similar to the following:

 Set<tab>one<tab>two<tab>three
 Set<tab>four<tab>five<tab>six
 ENDSET

 Set<tab>four<tab>two<tab>nine
 ENDSET

 Set<tab>one<tab>one<tab>one
 Set<tab>two<tab>two<tab>two
 ENDSET

 ...

And so on. I want to be able to extract a certain number of sets, say the first 10. I also want to be able to extract information from individual columns.

Once again, this is trivial to do with bash scripting, but I am unsure how to do it with Unix commands only. I can combine the commands in a shell script but, once again, it has to be Unix commands only.

basil
  • 1
    What do you mean by _Unix commands_? Is `awk` a Unix command? And `python` ? – mouviciel Nov 22 '16 at 18:39
  • I suppose I could pipe things through sed, awk, or even perl, as that is the only way I can think to do it without actually scripting it out via e.g. bash. I am just, unfortunately, ass at all of those. – basil Nov 22 '16 at 18:40
  • 1
    How would you do it "using bash"? That's pretty much the same as using "unix commands only". – William Pursell Nov 22 '16 at 18:50
  • I mean running cat on a file and then parsing it out that way, perhaps using intermediate files but no shell scripting – basil Nov 22 '16 at 20:27
  • 1
    I have no idea what you want. Is there any desired output you want to achieve? – glenn jackman Nov 22 '16 at 20:59

1 Answer


Without an output example it's hard to know your goal, but one Unix command you can use is awk.

Examples:

Extract 2 sets from your data sample (excluding the "ENDSET" lines and blank lines):

$ awk '/ENDSET/{ if(++count==2) exit(0);next; }NF{print}' file.txt
Set     one     two     three
Set     four    five    six
Set     four    two     nine
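
The one-liner above hard-codes the number of sets. Here is a minimal sketch of the same idea with the count passed in through `-v`. The variable names `n`, `sample` and `first_sets` are my own, and the sample file is rebuilt with `printf` only so the snippet runs on its own:

```shell
# Recreate a small tab-separated sample like the one in the question.
sample=$(mktemp)
printf 'Set\tone\ttwo\tthree\nSet\tfour\tfive\tsix\nENDSET\n\nSet\tfour\ttwo\tnine\nENDSET\n\nSet\tone\tone\tone\nSet\ttwo\ttwo\ttwo\nENDSET\n' > "$sample"

# n is the number of sets to keep (an illustrative variable, not from the question).
first_sets=$(awk -v n=2 '
    # a set just ended: count it, stop after the n-th one, never print the marker
    $1 == "ENDSET" { if (++count == n) exit; next }
    # any other non-blank line belongs to the current set, so print it
    NF
' "$sample")

printf '%s\n' "$first_sets"
rm -f "$sample"
```

Changing `n=2` to `n=10` should give the first 10 sets, as asked in the question.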

Extract 3 sets and print the 2nd column only (note that the 1st column is always "Set"):

$ awk '/ENDSET/{ if(++count==3) exit(0);next; }$2{print $2}' file.txt
one
four
four
one
two
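
Since the fields are tab-separated, it may be safer to set the field separator explicitly and pass the column number in as a variable too. This is only a sketch: `n`, `col`, `sample` and `column` are names I made up for illustration, and the sample file is generated here so the snippet is self-contained:

```shell
# Recreate a small tab-separated sample like the one in the question.
sample=$(mktemp)
printf 'Set\tone\ttwo\tthree\nSet\tfour\tfive\tsix\nENDSET\n\nSet\tfour\ttwo\tnine\nENDSET\n\nSet\tone\tone\tone\nSet\ttwo\ttwo\ttwo\nENDSET\n' > "$sample"

# n   = number of sets to read
# col = 1-based field number; col=2 is the first value after "Set"
column=$(awk -F'\t' -v n=3 -v col=2 '
    # end of a set: count it and stop once n sets have been read
    $1 == "ENDSET" { if (++count == n) exit; next }
    # skip blank or short lines, print the requested field otherwise
    NF >= col      { print $col }
' "$sample")

printf '%s\n' "$column"
rm -f "$sample"
```

With `-F'\t'` a value containing spaces stays in one field, which the default whitespace splitting would not guarantee.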

And so on... (more info: $ man awk)

Wilfredo Pomier