I have a text file with the following lines:

 Ca4  0.500001 0.000000 0.000000
 C4   0.750001 0.500000 0.000000
 O10  0.750001 0.243180 0.000000
 O8   0.652432 0.628410 -0.779621
 O12  0.847569 0.628410 0.779621
 Ca3  0.120090 0.500000 -3.035668
 C3   0.370090 0.000000 -3.035668
 O9   0.370090 -0.256820 -3.035668
 O7   0.272522 0.128410 -3.815289
 O11  0.467659 0.128410 -2.256048
 Ca1  0.000000 0.000000 0.000000
 C2   0.250000 0.500000 0.000000
 O4   0.250000 0.756820 0.000000
 O6   0.152432 0.371590 -0.779621
 O2   0.347569 0.371590 0.779621
 Ca2  0.620091 0.500000 -3.035668
 C1   0.870091 0.000000 -3.035668
 O3   0.870091 0.256820 -3.035668
 O5   0.772522 -0.128410 -3.815289
 O1   0.967660 -0.128410 -2.256048

What I want to do is simply reorder the lines so that the lines containing the string "Ca" come first and the rest of the lines stay in their original order.

I tried using

 grep "Ca" file | sort

but it only prints the lines containing "Ca" to the screen.

Any suggestions?

tripleee
git

3 Answers

You pretty much have to do two filters. You can sorta avoid having to open the file twice explicitly by using tee:

< file tee >(grep ^Ca > ca) | grep -v ^Ca > noca
cat ca noca > newfile

If you want to internally sort the Ca part:

< file tee >(grep ^Ca | sort > ca) | grep -v ^Ca > noca
cat ca noca > newfile

If it's really important to you not to open the file twice, you can use awk:

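awk '/^Ca/ { print; next }        # Ca lines are printed as soon as they are read
     { na[++n] = $0 }             # every other line is buffered in input order
     END { for (i = 1; i <= n; i++) print na[i] }' file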

but this approach can use a lot of memory as it keeps the non-Ca parts until the end of processing.
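
If memory matters more than opening the file once, a two-pass awk is one possible alternative. This is only a sketch: it reads the file twice, and it assumes each label starts at the beginning of its line. The file names sample.txt and reordered.txt are hypothetical.

```shell
# Hypothetical sample data; substitute your own file.
cat > sample.txt <<'EOF'
Ca4 0.500001 0.000000 0.000000
C4 0.750001 0.500000 0.000000
O10 0.750001 0.243180 0.000000
Ca3 0.120090 0.500000 -3.035668
C3 0.370090 0.000000 -3.035668
EOF

# The same file is named twice on the command line.
# First read (NR == FNR): print only the Ca lines.
# Second read: print every line that does not start with Ca.
awk 'NR == FNR { if (/^Ca/) print; next }
     !/^Ca/' sample.txt sample.txt > reordered.txt

cat reordered.txt
```

Nothing is buffered here, so memory use stays constant, at the cost of a second read of the input.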

kojiro
grep "Ca" file | sort;  grep -v  "Ca" file | sort

This will do what you need: first it outputs the sorted lines containing "Ca", then it prints the remaining lines, which do not contain "Ca". Note the "-v" option to grep, which inverts the match.

Also, if you need the output in a single stream, you can group the two pipelines inside { ...; } and join them with &&. The command would look like this:

{ grep "Ca" file | sort &&  grep -v  "Ca" file | sort; }
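
If the goal is to keep the original order within each group, as the comments on this answer suggest, a variant without the sort stages, using ";" instead of "&&", might look like the sketch below. The file names sample.txt and newfile.txt are hypothetical, and the pattern is anchored with ^ on the assumption that each label starts its line.

```shell
# Hypothetical sample data; substitute your own file.
cat > sample.txt <<'EOF'
Ca4 0.500001 0.000000 0.000000
C4 0.750001 0.500000 0.000000
O10 0.750001 0.243180 0.000000
Ca3 0.120090 0.500000 -3.035668
C3 0.370090 0.000000 -3.035668
EOF

# Both greps run unconditionally, one after the other, and the braces
# send their combined output to a single redirection.
{ grep '^Ca' sample.txt; grep -v '^Ca' sample.txt; } > newfile.txt

cat newfile.txt
```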
cerkiewny
  • That still doesn't make any sense. The usual statement delimiter is newline or semicolon. Using `&&` implies that the second command is conditional on the success of the first. – tripleee Sep 03 '15 at 17:00
  • It doesn't work, sorry. Although the file does not print, I don't want to order all lines – git Sep 03 '15 at 17:02
  • Then take out the `| sort`s. – tripleee Sep 03 '15 at 17:06

Here is an alternative solution

 nl -n rz ca | awk -vOFS="\t" '/Ca/{$1="#"$2} {$1=$1}1' | sort -k1,1 | cut -f2-

To keep things simple, the output is now tab-separated.

Explanation: number the lines to preserve the order of the other rows, replace the line number with a sort key on the rows that should come first, then sort and discard the key.
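
The same decorate, sort, undecorate idea can be sketched so that the Ca rows also keep their original relative order and each line's spacing is left untouched. This is a variant under stated assumptions, not the command above: the file names sample.txt and reordered.txt are hypothetical, and labels are assumed to start at the beginning of the line.

```shell
# Hypothetical sample data; substitute your own file.
cat > sample.txt <<'EOF'
Ca4  0.500001 0.000000 0.000000
C4   0.750001 0.500000 0.000000
O10  0.750001 0.243180 0.000000
Ca3  0.120090 0.500000 -3.035668
C3   0.370090 0.000000 -3.035668
EOF

# Decorate: prefix each line with a group key (0 for Ca, 1 otherwise)
# and its line number, separated by tabs.
# Sort: numerically by group, then by line number.
# Undecorate: cut away the two key fields, leaving the original line.
awk '{ printf "%d\t%d\t%s\n", (/^Ca/ ? 0 : 1), NR, $0 }' sample.txt |
  sort -t "$(printf '\t')" -k1,1n -k2,2n |
  cut -f3- > reordered.txt

cat reordered.txt
```

Because the original line travels as a single trailing field, nothing inside it is rewritten.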

karakfa