1

I am trying to add a column to a file:

1       12098   12258   0.00
1       12553   12721   1.37
1       13331   13701   34.69
1       30334   30503   0.00
1       35045   35544   0.00
1       35618   35778   0.00
1       69077   70017   0.24
1       324294  324394  0.68
1       324427  325605  3.18

so it looks like this:

1       12098   12258   unknown   0.00
1       12553   12721   unknown   1.37
1       13331   13701   unknown   34.69
1       30334   30503   unknown   0.00
1       35045   35544   unknown   0.00
1       35618   35778   unknown   0.00
1       69077   70017   unknown   0.24
1       324294  324394  unknown   0.68
1       324427  325605  unknown   3.18

I have managed to do it using this command:

awk '$3 = $3 FS "unknown"' <file> > <new_file>

However, I have over 900 files that need the same change, each written to its own new file. I find awk hard to follow, and was wondering whether there is a way to do this for many files at once, using #SBATCH scripts or any other method?

I am pretty new to Stack Overflow, so any help would be greatly appreciated! Thank you!

Inian
hdjc90
  • Does retaining the spacing between columns matter to you? If so is that tabs or blanks or something else? – Ed Morton Apr 27 '20 at 16:13

2 Answers

4

Here is an alternative sed solution that makes this change and saves it in place, keeping a .bak backup of each file:

sed -E -i.bak 's/[^[:blank:]]+$/unknown &/' *.txt
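
If the originals should stay untouched and each result should go to its own new file, as the question asks, a plain shell loop around the same sed expression is one option. This is only a sketch: the demo directory, the sample file names, and the `_new` suffix are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical demo directory and sample files, created only for illustration.
demo_dir=$(mktemp -d)
printf '1\t12098\t12258\t0.00\n1\t12553\t12721\t1.37\n' > "$demo_dir/a.txt"
printf '1\t13331\t13701\t34.69\n' > "$demo_dir/b.txt"

# Same sed expression as above, minus -i: each result goes to <name>_new.txt.
for f in "$demo_dir"/*.txt; do
  case $f in *_new.txt) continue ;; esac   # skip outputs left by an earlier run
  sed -E 's/[^[:blank:]]+$/unknown &/' "$f" > "${f%.txt}_new.txt"
done

cat "$demo_dir/a_new.txt"
```

Because the expression only touches the last field, the original separators before it are kept; the new word is followed by a single space.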
anubhava
  • Hi @anubhava thanks for this! It worked well. I was wondering whether it is possible to perform this command only on file names that are listed in a separate file? Thanks again – hdjc90 Apr 28 '20 at 08:20
  • 1
    Let's say that filenames are stored in a file called `files.txt` then you can use: `sed -E -i.bak 's/[^[:blank:]]+$/unknown &/' $( – anubhava Apr 28 '20 at 08:22
  • 1
    Brilliant! Thanks @anubhava worked well! One final question, can you get sed to work in the background? – hdjc90 Apr 28 '20 at 08:29
  • 1
    Yes just place `&` at the end of command like: `sed -E -i.bak 's/[^[:blank:]]+$/unknown &/' $( – anubhava Apr 28 '20 at 08:33
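
The commands in the two comments above are cut off in this copy, so the following is not a reconstruction of them. It is a sketch of one common way to restrict the edit to files named in a list and to run it in the background. `files.txt` is the name used in the comment; the one-name-per-line layout, the demo directory, and the sample files are assumptions.

```shell
#!/bin/sh
# Hypothetical demo setup: two data files, only one of which is listed.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '1 12098 12258 0.00\n' > listed.bed
printf '1 12553 12721 1.37\n' > skipped.bed
printf 'listed.bed\n' > files.txt      # assumed format: one file name per line

# Read the names line by line so that names containing spaces survive,
# and put the whole loop in the background with a trailing &.
while IFS= read -r f; do
  sed -E -i.bak 's/[^[:blank:]]+$/unknown &/' "$f"
done < files.txt &

wait    # block until the background job has finished
cat listed.bed
```

The `wait` is only needed when a later step depends on the edited files.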
1

EDIT: As per the OP's comments, here is a solution that writes the output for each input file to its own new file (*.bed to *_new.bed):

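awk '
FNR==1{                                # first line of each input file
  if (out_file != "") close(out_file)  # close the previous output file
  out_file=FILENAME                    # work on a copy, leave FILENAME alone
  sub(/\.[^.]*$/,"_new&",out_file)     # insert _new before the last extension
}
{
  $NF="unknown" OFS $NF                # put the new column before the last field
  print > (out_file)
}'  *.bed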


If you are not worried about the spacing between columns (assigning to a field makes awk rejoin the fields with single spaces), then you could try the following.

awk '{$NF="unknown" OFS $NF} 1'  Input_file

Or, with GNU awk 4.1 or later, edit the file(s) in place and keep a backup:

gawk -i inplace -v INPLACE_SUFFIX=.bak '{$NF="unknown" OFS $NF} 1'  Input_file(s)
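
Both commands above rebuild each line with awk's output field separator `OFS`, which is a single space by default. If the input is tab-separated (an assumption, the question does not say), setting both the input and output separators to a tab keeps the tabs. A small sketch, using one line from the question as sample data:

```shell
#!/bin/sh
# Assumption: the columns are separated by single tabs.
result=$(printf '1\t12098\t12258\t0.00\n' |
  awk 'BEGIN { FS = OFS = "\t" } { $NF = "unknown" OFS $NF } 1')

# Print the rebuilt line; the new column is now tab-delimited on both sides.
printf '%s\n' "$result"
```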


To align the columns neatly, you could pipe the output of the above command through column -t:

awk '{$NF="unknown" OFS $NF} 1'  Input_file | column -t

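This does not combine with the in-place version, because gawk -i inplace writes its output back into the file and nothing reaches the pipe. To align a file that was edited in place, run column on it separately:

column -t Input_file > Input_file.tmp && mv Input_file.tmp Input_file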
RavinderSingh13
  • Thanks for your suggestion. I am wanting to do this simultaneously for multiple files is there a way to do this? Thanks – hdjc90 Apr 27 '20 at 14:02
  • @hdjc90, if you have latest version of GNU awk it has inplace save along with backup of files too, remove backup files option `-v INPLACE_SUFFIX=.bak` in case you don't want backup of files, lemme know? – RavinderSingh13 Apr 27 '20 at 14:07
  • Hi @RavinderSingh13 thanks for your help. To clarify, I would like to enter a command to do all my files (e.g. *.bed) and output them all (e.g. *_new.bed)? – hdjc90 Apr 27 '20 at 14:11
  • @hdjc90, Could you please check my EDIT solution and lemme know if that helps you? – RavinderSingh13 Apr 27 '20 at 16:07