awk print to the top of the output file

Question

I have an input text file with paragraphs in it, which are separated by 3 empty lines. Example:

P1
P1
empty line here
empty line here
empty line here
P2
P2
empty line here
empty line here
empty line here
P3
P3
empty line here
empty line here
empty line here

Currently I'm using this code written into a *.awk file to get the paragraphs:

BEGIN{ORS=RS="\n\n\n"}
/some text pattern comes here because I dont want to print every paragraph just some of them but in reversed order/

So I'd like the output file to look like this:

P3
P3
empty line here
empty line here
empty line here
P2
P2
empty line here
empty line here
empty line here
P1
P1
empty line here
empty line here
empty line here

So I was wondering if I could print each paragraph to the top of the output file to get the reversed order. Is it possible to do it?

Why on earth would you write "empty line here" instead of just having an empty line??? Now we need to delete that text to create the sample input and expected output if we want to test a potential solution. Note that only gawk supports multi-char RS values; POSIX awks are free to ignore all but the first char. You MAY want to look into `RS=""`. If you fix your sample input and expected output to be testable as-is, others might take a look at it. – Ed Morton, Jan 11 '16 at 04:32

dawg · Accepted Answer · 2016-01-11T15:49:16.547

If you set RS="" then awk switches to paragraph mode: each record is a block of consecutive lines, and records are separated by one or more blank lines.

Given:

$ cat /tmp/so.txt
P1
P1



P2
P2



P3
P3
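
To see how paragraph mode splits this input, here is a minimal sketch. It recreates the sample in a temporary file (the variable names `sample` and `summary` are my own, not from the question) and reports what awk sees as each record:

```shell
# Recreate the sample input: paragraphs separated by three blank lines.
sample=$(mktemp)
printf 'P1\nP1\n\n\n\nP2\nP2\n\n\n\nP3\nP3\n' > "$sample"

# RS="" selects paragraph mode; FS="\n" makes each line of a record one field,
# so NF is the number of lines in the paragraph.
summary=$(awk 'BEGIN { RS = ""; FS = "\n" }
               { printf "record %d: %d lines, first line %s\n", NR, NF, $1 }' "$sample")
printf '%s\n' "$summary"
# record 1: 2 lines, first line P1
# record 2: 2 lines, first line P2
# record 3: 2 lines, first line P3

rm -f "$sample"
```

The blank lines themselves never show up in `$0`, which is why the separator has to be printed again on output.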

You can then grab $0, which holds the current record, store each record in an array, and print the records in reverse order:

$ awk 'BEGIN{RS=""} {a[i++]=$0} END {while(i--){ print a[i]; printf "\n\n\n"}}' /tmp/so.txt
P3
P3



P2
P2



P1
P1

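If the separator is always exactly three blank lines (and you have gawk, which supports a multi-character RS), you can also split on it directly. Three blank lines after a line of text amount to four consecutive newlines, so that is what RS has to match:

$ awk 'BEGIN{RS="\n\n\n\n"} {sub(/\n+$/, ""); a[i++]=$0} END {while(i--) printf "%s\n\n\n\n", a[i]}' /tmp/so.txt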

edit based on comment

Given:

P1 a
P1 b

P2 a filter this block
P2 b

P3 a
P3 b

You can add a pattern to filter unwanted blocks:

$ awk 'BEGIN{RS=""} /filter/ {next} {a[i++]=$0} END {while(i--){ print a[i]; print "\n"}}' /tmp/so.txt
P3 a
P3 b


P1 a
P1 b
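
The question asks for the opposite kind of filter: print only the paragraphs that match a pattern, in reverse order, with three blank lines after each. A minimal sketch of that, assuming paragraph mode is acceptable; the helper name `reverse_matching` and the `keep`/`drop` sample text are made up for illustration:

```shell
# Hypothetical helper: print only the paragraphs matching a regex, last one first,
# each followed by three blank lines.
#   $1 = regex (passed with -v, so awk processes backslash escapes in it)
#   $2 = input file
reverse_matching() {
  awk -v pat="$1" '
    BEGIN { RS = "" }                               # paragraph mode
    $0 ~ pat { a[n++] = $0 }                        # store matching paragraphs only
    END { while (n--) printf "%s\n\n\n\n", a[n] }   # newest first, 3 blank lines after each
  ' "$2"
}

# Usage on a made-up sample file:
sample=$(mktemp)
printf 'P1 keep\nP1\n\n\n\nP2 drop\nP2\n\n\n\nP3 keep\nP3\n' > "$sample"
reverse_matching 'keep' "$sample"
rm -f "$sample"
```

Because everything is held in the array until END, this trades memory for the ability to emit the last paragraph first; awk cannot insert text at the top of a file it has already written.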

edited Jan 11 '16 at 15:49

answered Jan 11 '16 at 01:19

dawg

Works very well and fast, like a charm. However, if I want to filter out paragraphs I have to run my other script (the one in the sample) after running this one, but that's a small issue, so I accepted this answer as it suits my needs best. – sasieightynine Jan 11 '16 at 12:46
You can add your pattern filter to this script or use it in a pipeline as well. Cheers – dawg Jan 11 '16 at 15:41

score 0 · Answer 2 · answered Jan 11 '16 at 01:10

tac inputfile | tail -n +4 | awk '{print};END{printf("\n\n\n")}'

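This reverses the lines of inputfile with tac, removes the three blank lines that end up at the top with tail, and then prints everything followed by three trailing newlines (to restore the ones tail removed). Note that tac reverses individual lines, so the lines within each paragraph come out in reverse order as well.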
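
To see what line-level reversal does to a paragraph whose lines differ, here is a small sketch. It uses an awk stand-in for tac (my own substitution, so it also runs where tac is not installed); the helper name `reverse_lines_awk` and the sample text are hypothetical:

```shell
# Stand-in for tac: remember every line, then print them last to first.
reverse_lines_awk() {
  awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }' "$1"
}

sample=$(mktemp)
printf 'P1 first\nP1 second\n\n\n\nP2 first\nP2 second\n\n\n\n' > "$sample"
reversed=$(reverse_lines_awk "$sample")
printf '%s\n' "$reversed"
# Output starts with three blank lines, then:
# P2 second
# P2 first
# (three blank lines)
# P1 second
# P1 first

rm -f "$sample"
```

The paragraphs are in the right order, but "second" now precedes "first" inside each one, which the P1/P1 sample in the question cannot reveal.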

answered Jan 11 '16 at 01:10

Erik Bryer

`tac` is only on Linux it should be noted. – dawg Jan 11 '16 at 01:20
Just for future readers on OSX, you can use `tail -r filename` in place of `tac`. – Mark Setchell Jan 11 '16 at 15:46
Does OS X lack GNU coreutils? – Erik Bryer Jan 14 '16 at 20:03

andrnev · Answer 3 · 2016-01-11T02:28:26.720

Would this work for you?

cat -n inputfile | sort -r | grep -i 'pattern' | awk -F'\t' 'ORS="\n\n\n" {print $2}'

Explanation

cat -n inputfile           # number each line in the file
sort -r                    # sort the numbered lines in reverse order
grep -i 'pattern'          # keep only the lines matching your text pattern (case-insensitive)
awk -F'\t' 'ORS="\n\n\n" {print $2}'
                           # drop the line numbers: print the second tab-separated column, ending each line with ORS

For example, if your inputfile is:

Pz - The quick brown fox jumped over the lazy dog
Pz - The quick blue fox jumped over the lazy dog



Pa - The quick brown fox jumped over the lazy dog
Pa - The quick blue fox jumped over the lazy deer



Px - The quick brown fox jumped over the lazy cat
Px - The quick bronw fox jumped over the lazy dog

Running the following to keep only the lines containing the text pattern "blue":

cat -n inputfile | sort -r | grep -i 'blue' | awk -F'\t' 'ORS="\n\n\n" {print $2}'

would give you

Pa - The quick blue fox jumped over the lazy deer


Pz - The quick blue fox jumped over the lazy dog
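
One detail of the number-and-sort step: plain `sort -r` compares the numbered lines as text, so the result may depend on how the numbers are padded and on the locale. A sketch that sorts numerically instead, under the assumption that `cat -n` separates the number from the text with a tab (as in the pipeline above); the helper name `number_sort_reverse` is hypothetical:

```shell
# Hypothetical helper: reverse the lines of a file by numbering them,
# sorting on the number in descending numeric order, and cutting the number off.
number_sort_reverse() {
  cat -n "$1" | sort -rn | cut -f2-
}

sample=$(mktemp)
# Twelve lines, so single- and double-digit line numbers both occur.
printf 'line %d\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$sample"
reversed_numbers=$(number_sort_reverse "$sample")
printf '%s\n' "$reversed_numbers"    # "line 12" first, "line 1" last

rm -f "$sample"
```

Like the original pipeline, this works on lines rather than paragraphs, so any pattern filter added to it selects matching lines only.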
