5

I am trying to reformat a large file. The first 6 columns of each line are fine, but the remaining columns need to be combined in pairs with a "/" character between them.

Example file (only a few columns are shown; the actual file has many more):

1       1       0       0       1       2       A       T       A       C

Into:

1       1       0       0       1       2       A/T     A/C

So far I have been trying awk and this is where I am at...

awk '{print $1,$2,$3,$4,$5; for(i=7; i < NF; i=i+2) print $i+"/"+$i+1}'  myfile.txt > mynewfile.txt
KBoehme

3 Answers

5
awk '{for(i=j=7; i < NF; i+=2) {$j = $i"/"$(i+1); j++} NF=j-1}1' input
perreal
  • `$j` rewrites the value of the jth column. The code only changes columns >= 7. For these columns it concatenates the columns i and i + 1 using `"/"` as the delimiter. The assignment to NF controls how many columns will be printed. – perreal Jan 27 '20 at 18:13
  • The block `{$j = $i"/"$(i+1); j++}` is the one repeated 7 to NF times, right? What about the last 1 in `}1`? – Alexandre Rademaker Jan 27 '20 at 18:17
  • Right, it starts from 7 and goes up to the last column (`NF`). The trailing `1` is a pattern that is always true; since it has no action attached, awk applies its default action, which is to print the row. – perreal Jan 27 '20 at 18:23
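As a rough illustration of the comments above, here is the same one-liner spread over several lines and run on the sample row from the question. The shell variable names `line` and `result` are only for this sketch. It assumes an awk that rebuilds the record when `NF` is assigned (gawk and mawk do; some older awks may not). The output is joined with single spaces (the default `OFS`), not the original spacing.

```shell
# Sample row from the question; any run of spaces or tabs separates fields.
line='1 1 0 0 1 2 A T A C'

# i walks the original columns two at a time; j is where each joined pair is written.
result=$(printf '%s\n' "$line" | awk '{
    j = 7
    for (i = 7; i < NF; i += 2) {
        $j = $i "/" $(i + 1)   # overwrite column j with the joined pair
        j++
    }
    NF = j - 1                 # drop the leftover columns after the last pair
}
1')

echo "$result"   # prints: 1 1 0 0 1 2 A/T A/C
```

The final `1` on its own line is the always-true pattern with no action, so awk prints the modified row.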
3

Please try this:

awk '{print $1" "$2" "$3" "$4" "$5" "$6" "$7"/"$8" "$9"/"$10}' myfile.txt > mynewfile.txt
zessx
  • Sorry, I didn't specify this in my post, but my actual file has more than 10 columns. In fact it has more than a million, so I need some way to apply that combining pattern through to the end of the line. All of the lines have the same number of columns, though. – KBoehme Aug 07 '13 at 00:13
1

"+" is the arithmetic addition operator. String concatenation is done by simply placing the strings next to each other, e.g. to get the string "foobar" you'd write:

"foo" "bar"

not:

"foo" + "bar"

Anyway, try this:

awk -v ORS= '{print $1,$2,$3,$4,$5,$6; for(i=7;i<=NF;i++) print (i%2?OFS:"/") $i; print "\n"}'  myfile.txt > mynewfile.txt
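To check the point about concatenation versus `+` made above, a small experiment from the shell may help. This is only a sketch; the shell variable names `concat`, `added`, and `joined` are made up for the example.

```shell
# Placing strings next to each other concatenates them.
concat=$(awk 'BEGIN { print "foo" "bar" }')

# "+" converts both operands to numbers first; non-numeric strings become 0.
added=$(awk 'BEGIN { print "foo" + "bar" }')

# The same juxtaposition rule joins two values with a "/" in between.
joined=$(awk 'BEGIN { a = "A"; b = "T"; print a "/" b }')

echo "$concat"   # prints: foobar
echo "$added"    # prints: 0
echo "$joined"   # prints: A/T
```

This is why an expression such as `$i+"/"+$i+1` produces numbers rather than joined strings.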
Ed Morton