Removing alpha numeric suffix in the file

Question

Some lines in my file have IDs in the 4th column that end with ':E1'. I want to remove exactly those lines: the 4th column must end with :E1, not :E11, :E10, etc. When I grep for the pattern, I find 87 matching lines:

    grep "\:E1\b" File | wc -l
    87

However, when I run

    sed '/:E1$/d' File > tmp
    wc -l File
    245797 File 
    wc -l tmp
    245797 tmp

the line count is the same as in the original file, which indicates that the lines whose 4th column ends with :E1 are not being removed. Where am I going wrong in understanding the command? The file looks like this:

    chr1    133374  133566  ENSG00000238009:E1  -   ENSG00000238009 1
    chr1    995083  995226  ENSG00000217801:E1  +   ENSG00000217801 1
    chr1    1385294 1385499 ENSG00000215915:E1  +   ENSG00000215915 1
    chr1    10003388    10003465    ENSG00000162441:E1  -   ENSG00000162441 1
    chr1    38273332    38273352    ENSG00000197982:E1  +   ENSG00000197982 1

I want to delete the lines whose 4th column ends in :E1.

In your regex `:E1$`, the `$` means the end of the line. None of the lines end with `:E1`. — James Brown, Sep 19 '16 at 19:08
I edited it to "that the lines with pattern ending with :E1 is not getting removed" — AishwaryaKulkarni, Sep 19 '16 at 19:10
You say " want to remove the ones specifically ending with :E1" and "I want to delete the lines ending in :E1 in the 4 th column" and "that the lines with pattern ending with :E1 is not getting removed". Which is it? Please update that question. — James Brown, Sep 19 '16 at 19:14
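
To make the point about `$` concrete, here is a minimal sketch, assuming the columns are separated by whitespace (tabs in this example). The file name `sample.bed` and the `:E10` line are made up for illustration and are not from the original file.

```shell
# Two sample lines in the question's layout (tab-separated; sample.bed is a hypothetical name)
printf 'chr1\t133374\t133566\tENSG00000238009:E1\t-\tENSG00000238009\t1\n'  >  sample.bed
printf 'chr1\t133600\t133700\tENSG00000238009:E10\t-\tENSG00000238009\t1\n' >> sample.bed

# Anchored to the end of the line: no line matches, because every line ends with column 7.
# grep exits non-zero when nothing matches, hence the "|| true".
grep -c ':E1$' sample.bed || true

# :E1 followed by whitespace: counts the :E1 line but not the :E10 line
grep -c ':E1[[:space:]]' sample.bed
```

The first count explains why `sed '/:E1$/d'` deletes nothing: the pattern is tested against the whole line, not against the 4th column.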

James Brown · Accepted Answer · 2016-09-19T19:18:54.970

2

"I want to delete the lines ending in :E1 in the 4th column":

    $ awk '$4 !~ /:E1$/' foo
    $
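
As a rough check of the filter above, the following sketch runs it on a small made-up sample and compares line counts. The file names `demo.bed` and `demo.filtered` and the `:E10`/`:E11` rows are assumptions for illustration only.

```shell
# Hypothetical sample with :E1, :E10 and :E11 IDs in column 4 (tab-separated).
# printf reuses the format string for each group of seven arguments.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr1 133374 133566 ENSG00000238009:E1  - ENSG00000238009 1 \
  chr1 133600 133700 ENSG00000238009:E10 - ENSG00000238009 1 \
  chr1 995083 995226 ENSG00000217801:E11 + ENSG00000217801 1 > demo.bed

# Keep every line whose 4th field does not end in :E1.
# Here $ anchors to the end of the field, not the end of the line.
awk '$4 !~ /:E1$/' demo.bed > demo.filtered

# Line counts before and after; the difference is the number of removed lines
wc -l < demo.bed
wc -l < demo.filtered
```

On this sample the counts should be 3 and 2: only the `:E1` row is dropped, while `:E10` and `:E11` are kept.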

edited Sep 19 '16 at 19:18

answered Sep 19 '16 at 19:12

James Brown



It does, thanks. I ran `awk '$4 !~ /:E1$/' File > tmp`; `wc -l tmp` shows 245710 tmp and `wc -l File` shows 245797 File. – AishwaryaKulkarni Sep 19 '16 at 19:25

score 1 · Answer 2 · answered Sep 19 '16 at 17:17


Search for :E1 at the end of the line ($) and replace it with nothing:

    sed 's/:E1$//' File

answered Sep 19 '16 at 17:17

Cyrus


Upload your file somewhere and post a link here. – Cyrus Sep 19 '16 at 17:38
