1

I have multiple files in a folder. This is what one of them, File1.txt, looks like:

ghfgh gfghh
  dffd  kjkjoliukjkj
  sdf ffghf
  sf 898575
  sfkj utiith

## 
my data to be extracted 

I want to extract the line immediately below the "##" pattern from every file and write those lines to an output file. Each extracted line should be preceded by the name of the file it came from. Desired output:

>File1
My data to be extracted
>File2
My data to be extracted
>File3
My data to be extracted 

This is what I tried:
awk '/##/{getline; print FILENAME; print ">"; print}' *.txt > output.txt
akang
    if you're considering using getline in future then make sure you understand everything discussed in http://awk.freeshell.org/AllAboutGetline before deciding to do so. – Ed Morton Oct 12 '18 at 22:40

4 Answers

4

This assumes one extract per file (otherwise the filename header is repeated for every match):

$ awk '/##/{f=1; next} f{print ">"FILENAME; print; f=0}' *.txt > output.txt
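
The desired output in the question shows ">File1" rather than ">File1.txt", so a variation on the command above might strip the extension before printing the header. The following is only a sketch under assumptions: the sample files and their contents are made up for the demo, the marker is assumed to start its line, and the flag is reset at the start of each file so that a "##" on the last line of one file cannot leak into the next.

```shell
# Demo setup: two hypothetical sample files in a scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'ghfgh gfghh\n  sdf ffghf\n\n## \nfirst record\n' > File1.txt
printf 'sf 898575\n## \nsecond record\ntrailing line\n' > File2.txt

awk '
  FNR == 1 { f = 0 }                 # new input file: clear any pending flag
  /^##/    { f = 1; next }           # marker line: remember it and skip it
  f        { name = FILENAME
             sub(/\.txt$/, "", name) # drop the extension for the header
             print ">" name
             print                   # the line right after the marker
             f = 0 }
' *.txt > output.txt

cat output.txt
```

With these two sample files, output.txt should hold a ">File1" header followed by "first record", then ">File2" followed by "second record". Running the command a second time in the same folder would also feed output.txt to awk, so writing the result elsewhere or narrowing the glob may be safer.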
karakfa
2

Perl to the rescue!

perl -ne 'print ">$ARGV\n", scalar <> if /^##/' -- *.txt > output.txt
  • -n reads the input line by line
  • $ARGV contains the current input file name
  • scalar <> reads one line from the input
choroba
1

A quick way with grep:

grep -A1 '##' *.txt|grep -v '##' > output.txt
Kent
0

POSIX or GNU sed:

$ sed -n '/^##/{n;p;}' file
my data to be extracted 

grep and sed:

$ grep -A 1 '##' file | sed '1d'
my data to be extracted 
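
Both commands above work on a single file and print no header. One way to apply the sed version to every file and add the ">name" line from the question is a shell loop. This is a sketch, not part of the original answer: the sample files are hypothetical, and the header is printed for every file even if it contains no "##" line.

```shell
# Demo setup: two hypothetical sample files in a scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'aaa\n## \nalpha data\n' > File1.txt
printf 'bbb\n## \nbeta data\nccc\n' > File2.txt

# For each input file, print ">" plus the name without ".txt",
# then the line that follows each line starting with "##".
for f in File*.txt; do
  printf '>%s\n' "${f%.txt}"
  sed -n '/^##/{n;p;}' "$f"
done > output.txt

cat output.txt
```

The glob File*.txt is used instead of *.txt so that output.txt is never picked up as an input on a later run.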
dawg