3

I need to remove lines beginning with '#' from a text file, but keep the first line because it is the header. How can I make grep ignore the first line and remove every line beginning with # from the rest of the file?

cat sample.txt
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,xyz

cat sample.txt | grep -v "^\s*[#\;]\|^\s*$" > "out.txt"

But this removes the header too!

Cyrus
Aprilian8
  • Possible duplicate of [Omitting the first line from any Linux command output](https://stackoverflow.com/q/7318497/608639), [Print a file skipping first X lines in Bash](https://stackoverflow.com/q/604864/608639), etc. – jww Apr 21 '19 at 05:30
  • I don't think it's the same. I need to write the header to the output file too. – Aprilian8 Apr 21 '19 at 05:36

6 Answers

6

With sed:

sed '2,${/^#/d}' sample.txt

From the second row (2) to the last row ($): search (/.../) for rows beginning (^) with # and delete (d) them. sed prints every remaining row by default.

Output:

#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
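
Since the question wants the result in out.txt, here is a minimal sketch of the same sed command with a redirect and a quick sanity check. The scratch directory and the shortened sample data are assumptions for illustration only; the `;` before `}` is added because some non-GNU seds require it.

```shell
# Recreate a shortened version of the question's sample in a scratch directory
# (the directory and file contents here are illustrative only).
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
#"EVENT",VERSION, NAME
1,2,xyz
EOF

# Keep line 1 untouched; delete '#' lines from line 2 to the last line.
sed '2,${/^#/d;}' "$workdir/sample.txt" > "$workdir/out.txt"

# Count the lines that still start with '#'; only the header should be left.
hash_lines=$(grep -c '^#' "$workdir/out.txt")
echo "lines starting with #: $hash_lines"
```

If the count is anything other than 1, either the header does not start with # or the range did not cover the whole file.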
Cyrus
2

This might work for you (GNU sed):

sed '1b;/^#/d' file

Ignore the first line and delete any other lines that start with #.
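
If you are not on GNU sed, a `;` directly after `b` may not be accepted everywhere, so a cautious variant is to split the script into separate -e expressions. This is a sketch under that assumption; the input below is a made-up stand-in for `file`.

```shell
# Hypothetical input standing in for `file`.
input=$(printf '%s\n' '#"EVENT",VERSION, NAME' '1,2,xyz' '#"EVENT",VERSION, NAME' '1,2,abc')

# -e 1b       : on line 1, branch to the end of the script, so the line is printed as-is
# -e '/^#/d'  : on every other line, delete it if it starts with '#'
result=$(printf '%s\n' "$input" | sed -e 1b -e '/^#/d')
printf '%s\n' "$result"
```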

potong
2

Applying an arbitrary command to all but the first line - a "header" - of a file or stream of tabular data is such a common task for me that I define a helper utility called body for it:

As a shell function (put this in your ~/.bashrc or equivalent):

body() {
  IFS= read -r header
  printf '%s\n' "$header"
  "$@"
}

Now:

$ cat sample.txt | body grep -v '^#'
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz

Credit: adapted from Command line tools for doing data science, where it is one of many handy data tools you can put on your shell's PATH. I wish more of these were available as standard UNIX tools.
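
To show that body is not tied to grep, here is a small sketch that sorts the data rows while leaving the header in place. The CSV content is invented for the example; the function is the one defined above, repeated so the snippet runs on its own.

```shell
# Same helper as above: print the first line, then hand the rest of stdin to "$@".
body() {
  IFS= read -r header
  printf '%s\n' "$header"
  "$@"
}

# Made-up data: sort numerically on the second comma-separated field,
# without letting the header take part in the sort.
sorted=$(printf '%s\n' 'name,qty' 'pear,3' 'apple,7' 'fig,1' | body sort -t, -k2,2n)
printf '%s\n' "$sorted"
```

This works because read consumes only the first line of the pipe, so the wrapped command sees everything after the header.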

raven-rock
1

Try a combination of head and grep like so:

head -1 sample.txt > out.txt && grep -v "^#" sample.txt >> out.txt

Result

#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz

Alternate method

grep "^#" sample.txt | head -1 > out.txt && grep -v "^#" sample.txt >> out.txt

That is - grep the lines beginning with #, but keep only the first one and write it to a file. Then grep all lines not starting with # and append those lines to the same output file.
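
Both commands above assume the header itself starts with #. A possible sketch that does not depend on that is to take the header with head and filter only the remaining lines via tail -n +2. The sample data and scratch directory below are hypothetical, with a header that deliberately lacks the #.

```shell
# Hypothetical sample whose header does NOT start with '#'.
workdir=$(mktemp -d)
printf '%s\n' 'EVENT,VERSION,NAME' '1,2,xyz' '#"EVENT",VERSION, NAME' '1,2,abc' \
  > "$workdir/sample.txt"

{
  # Line 1 is copied unconditionally.
  head -n 1 "$workdir/sample.txt"
  # Lines 2..end are filtered; '#' lines are dropped.
  tail -n +2 "$workdir/sample.txt" | grep -v '^#'
} > "$workdir/out.txt"

cat "$workdir/out.txt"
```

Grouping the two commands in braces lets a single redirect create out.txt, so there is no separate append step.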

zedfoxus
  • When the header line doesn't start with `#`, it goes wrong. Also wrong is `head -1 sample.txt out.txt && grep -v "^#" sample.txt >> out.txt`. – Walter A Sep 03 '22 at 13:28
1

This will cause any awk to print each line if its line number is 1 or it doesn't start with #:

$ awk 'NR==1 || !/^#/' file
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
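
The grep in the question also dropped blank lines and lines whose first non-blank character is ; so, if that behaviour is still wanted, the same awk idea might be extended as sketched below. The input is invented, and the wider pattern is my reading of the question's regex rather than something the asker confirmed.

```shell
# Made-up input: header, data, an empty line, an indented ';' comment,
# a repeated '#' header, and more data.
input=$(printf '%s\n' '#"EVENT",VERSION, NAME' '1,2,xyz' '' '  ; note' '#"EVENT",VERSION, NAME' '1,2,abc')

# Print line 1 always; print later lines unless they are blank or
# start (after optional spaces/tabs) with '#' or ';'.
filtered=$(printf '%s\n' "$input" | awk 'NR==1 || !/^[ \t]*([#;]|$)/')
printf '%s\n' "$filtered"
```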
Ed Morton
0

Tried on GNU sed:

sed '0,/^#/n;/^#/d' sample.txt