I need every number in column 4 to have 4 digits (5 characters counting the decimal point).

Input

AGAP4 2061 0.534207 917.0 0 0 1
AGAP5 2061 0.536148 101.5 0 0 8
AGBL1 3201 0.514214 917.9 0 0 2
AGBL2 2709 0.444814 12.5 0 0 1

Desired output

AGAP4 2061 0.534207 917.0 0 0 1
AGAP5 2061 0.536148 101.5 0 0 8
AGBL1 3201 0.514214 917.9 0 0 2
AGBL2 2709 0.444814 12.50 0 0 1
Jontexas

4 Answers

This isn't quite what you're asking for, but it provides a consistent width:

awk '{$4=sprintf("%06.2f", $4)}1' input

which produces:

AGAP4 2061 0.534207 917.00 0 0 1
AGAP5 2061 0.536148 101.50 0 0 8
AGBL1 3201 0.514214 917.90 0 0 2
AGBL2 2709 0.444814 012.50 0 0 1
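
If the goal is exactly four digits plus the decimal point, as in the desired output, one option is to pick the precision from the number of integer digits. The following is only a sketch: it assumes every value in column 4 is non-negative and below 10000, and the function name fix_col4 is made up for illustration.

```shell
# fix_col4 (hypothetical helper): reformat column 4 to four digits in total.
# Assumes non-negative values below 10000 and "." as the decimal separator.
fix_col4() {
    LC_ALL=C awk '{
        n = length(int($4))            # digits before the decimal point
        p = 4 - n                      # digits left for the fractional part
        if (p < 0) p = 0
        $4 = sprintf("%." p "f", $4)   # builds "%.1f" for 917.0, "%.2f" for 12.5
    } 1' "$@"
}
```

Called as fix_col4 input, this leaves 917.0 unchanged and turns 12.5 into 12.50. Rounding can still add a digit (99.996 becomes 100.00), so values close to a power of ten may need separate handling.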
William Pursell

It's quite inflexible but deals with your specific issue:

awk 'length($4) == 4 { $4 = $4 "0" }1' file

All it does is add a 0 to the end of the 4th field if it is 4 characters long.

If the requirement is more complex, e.g. the length can vary by more than one digit, then you should update your question to show some different input.
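
For what it's worth, the same idea could be stretched to cover values that are short by more than one character by appending zeros in a loop. This is a sketch under assumptions, not a tested general solution: the function name pad_col4 and its width argument are hypothetical, it reads standard input only, and it never shortens a field that is already too long.

```shell
# pad_col4 (hypothetical helper): right-pad column 4 with zeros to a given width.
# Usage: pad_col4 [width] < file    (width defaults to 5)
pad_col4() {
    awk -v w="${1:-5}" 'NF >= 4 {
        # add a decimal point first, so zeros are only appended after it
        if (index($4, ".") == 0 && length($4) + 1 < w + 0) $4 = $4 "."
        if (index($4, ".") > 0)
            while (length($4) < w + 0) $4 = $4 "0"
    } 1'
}
```

With the default width, 12.5 becomes 12.50 and a bare 7 becomes 7.000, while 917.0 is left alone.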

Tom Fenech

In bash (or POSIX shell), your primary built-in tool for formatting is printf. You can read the first four columns of each line into separate variables and the rest of the line into a catch-all variable, then print them with printf, formatting each column to a specific width as required:

#!/bin/bash

while read -r c1 c2 c3 c4 stuff; do 
    printf "%5s %4s %8s %5s %s\n" "$c1" "$c2" "$c3" "$c4" "$stuff"
done < "$1"

exit 0

Input

$ cat dat/agap.txt
AGAP4 2061 0.534207 917.0 0 0 1
AGAP5 2061 0.536148 101.5 0 0 8
AGBL1 3201 0.514214 917.9 0 0 2
AGBL2 2709 0.444814 12.5 0 0 1

Output

$ bash fmtagap.sh dat/agap.txt
AGAP4 2061 0.534207 917.0 0 0 1
AGAP5 2061 0.536148 101.5 0 0 8
AGBL1 3201 0.514214 917.9 0 0 2
AGBL2 2709 0.444814  12.5 0 0 1

printf in bash takes the same format string and format specifiers as it does in C. You can read about all the things you can do with formatting in man 3 printf. In addition, bash adds a few extensions of its own, such as printf -v varname "fmt string", which formats the result and saves it in varname instead of printing it.

One limitation of the format string is padding. While you can 0 pad on the left, there is no flag to 0 pad a number on the right. Whether you use %s (a string conversion) or %5.1f (a floating-point conversion), you are limited to left-padding and a field-width specification; a precision such as .2 fixes the number of decimal places, not the total width.
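
A few printf calls illustrate what the format string can and cannot do. This is only a demonstration sketch: the function name demo_printf is made up, and it assumes a locale that uses "." as the decimal separator.

```shell
# demo_printf (hypothetical helper): padding behaviour of printf on the value 12.5.
demo_printf() {
    printf '%06.2f\n' 12.5    # zero-padded on the left to width 6: 012.50
    printf '%5.1f\n'  12.5    # space-padded on the left to width 5
    printf '%-6s|\n'  12.5    # "-" pads on the right, but only with spaces
    printf '%.2f\n'   12.5    # precision fixes the decimal places: 12.50
}
```

The last form gives 12.50, but applied to 917.0 it would give 917.00, which is why a fixed precision alone does not yield a fixed total width.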

You can, of course, check the length of each variable before printing and 0 pad on the right that way, but that is about the point where you start asking whether an external utility can do that for you. But, for completeness:

#!/bin/bash

while read -r c1 c2 c3 c4 stuff; do 
    # append zeros until column 4 is 5 characters wide
    while [ "${#c4}" -lt 5 ]; do
        c4="${c4}0"
    done
    printf "%s %s %s %s %s\n" "$c1" "$c2" "$c3" "$c4" "$stuff"
done < "$1"

exit 0

Output

$ bash fmtagap.sh dat/agap.txt
AGAP4 2061 0.534207 917.0 0 0 1
AGAP5 2061 0.536148 101.5 0 0 8
AGBL1 3201 0.514214 917.9 0 0 2
AGBL2 2709 0.444814 12.50 0 0 1
David C. Rankin

Perl solution similar to @William's awk solution:

perl -lane '$F[3] = sprintf("%06.2f", $F[3]); print join " ",@F' input

The -a switch autosplits each line on whitespace into the @F array.

Output:

AGAP4 2061 0.534207 917.00 0 0 1
AGAP5 2061 0.536148 101.50 0 0 8
AGBL1 3201 0.514214 917.90 0 0 2
AGBL2 2709 0.444814 012.50 0 0 1

Using substr to produce the format you asked for:

perl -lane '$F[3] = substr(sprintf("%5.2f", $F[3]),0,5); print join " ",@F' input

AGAP4 2061 0.534207 917.0 0 0 1
AGAP5 2061 0.536148 101.5 0 0 8
AGBL1 3201 0.514214 917.9 0 0 2
AGBL2 2709 0.444814 12.50 0 0 1
Chris Koknat