I would like to remove the string "chr" from the following text file, using bash:

    FO538757.1      chr1:183937
    AL669831.3      chr1:601436
    AL669831.3      chr1:601667
    AL669831.3      chr1:609395
    AL669831.3      chr1:609407
    AL669831.3      chr1:611317

So that the resulting file looks like:

FO538757.1      1:183937
AL669831.3     1:601436
AL669831.3     1:601667
AL669831.3     1:609395
AL669831.3     1:609407
AL669831.3     1:611317

I checked previous threads and tried:

    sed 's/^chr//'
    awk 'BEGIN {OFS=FS="\t"} {gsub(/chr1/,"1",$2)}2'

Neither of them worked. Is there a better option than awk?

Thank you!

Inian
    Do you **really** have a bunch of blanks at the start of each line that you also want to remove? You're getting answers that assume you do, so if not, please [edit] your question to fix your example. – Ed Morton May 20 '21 at 17:51

4 Answers

I suspect all you really need is:

sed 's/chr//' file
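
To see why the `sed` attempt from the question has no effect, here is a small sketch on one sample line (the variable names are only for illustration). `^chr` is anchored to the start of the line, but "chr" sits at the start of the second column, so nothing matches. The `awk` attempt sets `FS` to a tab, so if the columns are actually separated by spaces (an assumption, the question does not say) there is no second field for it to edit.

```bash
# Sample line copied from the question, including the leading blanks.
line='    FO538757.1      chr1:183937'

# Anchored: "^chr" only matches "chr" at the very start of the line,
# so this line passes through unchanged.
anchored=$(printf '%s\n' "$line" | sed 's/^chr//')

# Unanchored: removes the first "chr" found anywhere on the line.
unanchored=$(printf '%s\n' "$line" | sed 's/chr//')

# Brackets make the leading blanks visible.
printf '[%s]\n' "$anchored" "$unanchored"
```

The first bracketed line still contains "chr"; the second does not.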
Ed Morton

You can do that quite easily with sed and two expressions: the first removes chr and the second removes leading whitespace, e.g.

sed -e 's/chr//' -e 's/^[[:blank:]]*//'  file

Example Use/Output

With your input in a file named file, you would get:

$ sed -e 's/chr//' -e 's/^[[:blank:]]*//'  file
FO538757.1      1:183937
AL669831.3      1:601436
AL669831.3      1:601667
AL669831.3      1:609395
AL669831.3      1:609407
AL669831.3      1:611317
David C. Rankin

With your shown samples, please try the following. Brief explanation: it removes a leading chr from the 2nd field and then prints the line. Modifying a field makes awk rebuild the current record, which also drops the leading spaces.

awk '{sub(/^chr/,"",$2)} 1' Input_file

If your Input_file is tab-delimited and each line starts with a tab (which makes the value the 3rd field), then try the following:

awk 'BEGIN{FS=OFS="\t"} {sub(/^chr/,"",$3);sub(/^\t+/,"")} 1' Input_file
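
One side effect of the first command is worth knowing: when awk rebuilds the record it joins the fields with `OFS`, which is a single space by default, so the run of spaces between the columns collapses too. A minimal sketch on one sample line (variable names are only for illustration), with a tab as an alternative output separator:

```bash
# Sample line copied from the question, including the leading blanks.
line='    FO538757.1      chr1:183937'

# Default OFS: fields are rejoined with a single space.
rebuilt=$(printf '%s\n' "$line" | awk '{sub(/^chr/,"",$2)} 1')

# Same edit, but rejoin the fields with a tab instead.
tabbed=$(printf '%s\n' "$line" | awk -v OFS='\t' '{sub(/^chr/,"",$2)} 1')

# Brackets make leading and trailing whitespace visible.
printf '[%s]\n' "$rebuilt" "$tabbed"
```

If the original column spacing matters, the sed approach leaves it untouched.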
RavinderSingh13

Using the bash shell with Parameter Expansion and mapfile (aka readarray):

#!/usr/bin/env bash

shopt -s extglob

mapfile -t array < file.txt

array=("${array[@]##+([[:space:]])}")

printf '%s\n' "${array[@]/chr}"

Inside a script extglob must be enabled explicitly, but in an interactive shell it might already be enabled, so as a one-liner:

mapfile -t array < file.txt; array=("${array[@]##+([[:space:]])}"); printf '%s\n' "${array[@]/chr}"

Note that this will be very slow for large data sets or files.
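
For reference, here is what the two expansions do on a single string rather than on the whole array. This is only a sketch with illustrative variable names, and it assumes bash, since `shopt` and the `+(...)` pattern are not available in plain sh.

```bash
shopt -s extglob   # needed for the +(...) pattern below

# Sample line copied from the question, including the leading blanks.
line='    FO538757.1      chr1:183937'

# "##" strips the longest leading match of the pattern;
# +([[:space:]]) means one or more whitespace characters.
trimmed=${line##+([[:space:]])}

# "/chr" with no replacement text deletes the first occurrence of "chr".
result=${trimmed/chr}

# Brackets make any remaining leading whitespace visible.
printf '[%s]\n' "$result"
```

Applied to `"${array[@]}"` as in the script, the same expansions run once per element.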

Jetchisel