
I have certain lines in a file whose IDs in the 4th column end with ':E1'. I want to remove only the lines whose 4th column ends specifically with :E1, not :E11, :E10, etc. When I run grep, I find 87 lines matching that pattern:

    grep "\:E1\b" File | wc -l
    87

However, when I run

    sed '/:E1$/d' File > tmp
    wc -l File
    245797 File 
    wc -l tmp
    245797 tmp

the line count is the same as in the original file, which indicates that the lines whose 4th column ends with :E1 are not being removed. Where am I going wrong in my understanding of the command? The file looks like this:

    chr1    133374  133566  ENSG00000238009:E1  -   ENSG00000238009 1
    chr1    995083  995226  ENSG00000217801:E1  +   ENSG00000217801 1
    chr1    1385294 1385499 ENSG00000215915:E1  +   ENSG00000215915 1
    chr1    10003388    10003465    ENSG00000162441:E1  -   ENSG00000162441 1
    chr1    38273332    38273352    ENSG00000197982:E1  +   ENSG00000197982 1

I want to delete the lines whose 4th column ends in :E1.

AishwaryaKulkarni

2 Answers


"I want to delete the lines ending in :E1 in the 4th column":

    $ awk '$4 !~ /:E1$/' foo
    $
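
For context, the sed attempt in the question most likely leaves the file unchanged because `$` anchors to the end of the whole line, and each line here ends with the 7th column rather than with `:E1`. In awk, `$4 !~ /:E1$/` applies the same anchor to field 4 only. A minimal sketch on a made-up three-line sample (the file name `sample.txt` and its contents are assumptions for illustration, not the asker's data):

```shell
# Hypothetical sample: whitespace-separated, ID in column 4.
printf '%s\n' \
  'chr1 133374 133566 ENSG00000238009:E1 - ENSG00000238009 1' \
  'chr1 133700 133800 ENSG00000238009:E11 - ENSG00000238009 1' \
  'chr1 995083 995226 ENSG00000217801:E2 + ENSG00000217801 1' > sample.txt

# sed: $ means end of the whole line, so no line matches and nothing is deleted.
sed_count=$(sed '/:E1$/d' sample.txt | wc -l | tr -d ' ')

# awk: $ is evaluated against field 4 only, so the :E1 line is dropped
# while the :E11 and :E2 lines are kept.
awk_count=$(awk '$4 !~ /:E1$/' sample.txt | wc -l | tr -d ' ')

echo "sed kept $sed_count lines, awk kept $awk_count lines"
# prints: sed kept 3 lines, awk kept 2 lines
```

On the real file, the difference between the two counts should equal the number of lines whose 4th column ends in `:E1`.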
James Brown
    It does, thanks. I ran awk '$4 !~ /:E1$/' File > tmp; wc -l tmp then shows 245710 tmp, while wc -l File shows 245797 File. – AishwaryaKulkarni Sep 19 '16 at 19:25

Search for :E1 at the end of the line ($) and replace it with nothing:

sed 's/:E1$//' File 
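
Note that `s/:E1$//` edits the matched text rather than deleting the line, and `$` refers to the end of the whole line, not the end of the 4th column. If sed is preferred for deleting the lines, one possible sketch is to step over the first three fields explicitly. It assumes whitespace-separated fields and at least one more column after the fourth; `sample.txt` and `kept.txt` are hypothetical names used only for this illustration:

```shell
# Hypothetical sample: whitespace-separated, ID in column 4.
printf '%s\n' \
  'chr1 133374 133566 ENSG00000238009:E1 - ENSG00000238009 1' \
  'chr1 133700 133800 ENSG00000238009:E11 - ENSG00000238009 1' \
  'chr1 995083 995226 ENSG00000217801:E2 + ENSG00000217801 1' > sample.txt

# Skip three whitespace-delimited fields, then require the fourth field
# to end in :E1 (i.e. :E1 immediately followed by whitespace), and delete.
sed -E '/^([^[:space:]]+[[:space:]]+){3}[^[:space:]]*:E1[[:space:]]/d' sample.txt > kept.txt

cat kept.txt
```

Here the `:E1` line is removed and the `:E11` and `:E2` lines are kept. The awk form is arguably easier to read, since it names the column directly.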
Cyrus