5

I would like to move a specific column (the 2nd) to the last column position. I have multiple large tab-delimited files with varying numbers of columns and rows, but in every file column 2 needs to end up last. In other words, I want the column order to be 1, 3-last, 2.

From this:

Column1 Column2 Column3 ... ColumnW ColumnX
1 2 3 ... W X
a b c ... apples oranges

To this:

1 3 ... W X 2
a c ... apples oranges b

I'm newish to awk. From reading other threads, I've copied and tried various things like this with no success.

#doesn't reorder columns

cut -d $'\t' -f1,3-,2 file.in > file.out


#doesn't work and I don't really understand the for(i...) stuff copied from elsewhere:
cat file.in | awk -F'\t' '{print $1,for(i=3;i<=NF;++i) $i,$2}' > file.out

help?

Any pointers to threads/links that explain in simple educational terms what's going on with the for(i...) part would be appreciated as well. I get the gist, not the syntax.

fedorqui
user3212388
  • The reason you don't understand the `for(i...) stuff` is that it's nonsense as written. Do not refer to wherever you copied it from in future. A `for` loop in awk works the same as a `for` loop in C or any other Algol-based language. – Ed Morton Jan 20 '14 at 16:49
  • My fault. I assume it was appropriate for wherever I found it, but lack of understanding and exploratory copy/paste on my part likely turned it into nonsense. Thanks. – user3212388 Jan 21 '14 at 15:34

3 Answers

9

With awk:

$ awk 'BEGIN{FS=OFS="\t"} {a=$2; for (i=2;i<NF; i++) $i=$(i+1); $NF=a}1' file
1       3       W       X       2
a       c       apples  oranges b

This is the same as the more explicit:

awk 'BEGIN{FS=OFS="\t"} {a=$2; for (i=2;i<NF; i++) $i=$(i+1); $NF=a; print}' file

Explanation

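  • `BEGIN{FS=OFS="\t"}` sets both the input and the output field separator to a tab.
  • `a=$2` stores the value of the 2nd field.
  • `for (i=2;i<NF; i++) $i=$(i+1)` shifts every field from the 3rd onward one position to the left: field 2 takes the value of field 3, field 3 the value of field 4, and so on up to field NF-1.
  • `$NF=a` sets the last field to the 2nd value, which we stored.
  • `1` is an always-true condition with no action, so awk applies its default action: `{print $0}`.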
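The approach the question was reaching for, printing the columns in the wanted order rather than reassigning them, can also be written with an explicit loop. The following is a minimal sketch and not part of the answer above; the function name `move_col2_last` and the sample data are made up for illustration, and it assumes every line has at least two tab-separated columns.

```shell
# move_col2_last is a hypothetical helper name used only for this sketch.
# It reads tab-delimited text from the named files (or from standard input)
# and prints column 1, then columns 3..NF, then column 2.
move_col2_last() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        printf "%s", $1                 # column 1, no separator before it
        for (i = 3; i <= NF; i++)       # i takes the values 3, 4, ..., NF
            printf "%s%s", OFS, $i      # a tab, then column i
        printf "%s%s%s", OFS, $2, ORS   # a tab, column 2, end of line
    }' "$@"
}

# Made-up sample data; prints the two rows as
# 1 3 W X 2 and a c apples oranges b (tab-separated).
printf '1\t2\t3\tW\tX\na\tb\tc\tapples\toranges\n' | move_col2_last
```

The header `for (i = 3; i <= NF; i++)` reads as: start with `i` at 3, repeat while `i` is at most `NF` (the number of fields on the current line), and add 1 to `i` after each pass. A `for` loop is a statement, so it cannot appear inside the argument list of `print`, which is why the attempt in the question does not parse.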
fedorqui
  • The `1` at the end is just an always-true condition that causes the line to be printed. Handy for playing code-golf with Awk, but outside of such games I prefer clarity over concision - a simple `;print` inside the braces, for example. – Mark Reed Jan 20 '14 at 15:44
  • @MarkReed sometimes that `1` is very handy. But you are right that it is not that clear. Updated with an alternative version containing `print`. – fedorqui Jan 20 '14 at 16:02
  • @MarkReed you can't prefer clarity over conciseness because conciseness IS clarity (with brevity). In this case, the thing I like about `1` is not what it brings to a given script but that it drives home 2 very important points that many awk newbies struggle with - the condition/action structure of the language, and the default action of `print $0`. If you don't understand those points you can't use awk effectively, so either you already understand `1` and there's no problem with seeing it in a script, or you need to learn it, in which case thank goodness you saw it in a script. – Ed Morton Jan 20 '14 at 17:22
  • Thank you for breaking this down for me. I can see I wasn't thinking efficiently about the problem. I was viewing it as listing columns in the order that I want to print, rather than reassigning all and then printing all. Lots to learn. – user3212388 Jan 21 '14 at 15:54
3

With GNU awk for gensub():

$ cat file
Column1 Column2 Column3 ...     ColumnW ColumnX
1       2       3       ...     W       X
a       b       c       ...     apples  oranges

$ awk '{print gensub(/([\t][^\t]+)(.*)/,"\\2\\1","")}' file
Column1 Column3 ...     ColumnW ColumnX Column2
1       3       ...     W       X       2
a       c       ...     apples  oranges b
Ed Morton
  • I'm sure if you just thought about it for a few mins you'd get it. It's really not very complicated or even interesting: it just looks for the first tab followed by non-tabs `([\t][^\t]+)` and moves the string matching that pattern to after the rest `(.*)` by referencing them as `\\1` and `\\2`. You could actually do this with `sed` just as easily. – Ed Morton Jan 21 '14 at 16:01
  • Note that the gensub() function is specific to GNU awk. – BMW Jan 22 '14 at 01:42
  • Hence my opening sentence `With GNU awk for gensub():` – Ed Morton Jan 22 '14 at 02:49
0

Using GNU sed:

sed -r 's/\W+(\w*)(.*)/\2\t\1/' file
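The command above splits on word and non-word characters rather than on tabs, so it relies on columns 1 and 2 consisting only of letters, digits, and underscores. A tab-anchored variant might look like the sketch below. It is an illustration under stated assumptions, not the answerer's method: the function name `col2_to_end_sed` and the sample line are made up, and a literal tab is built with `printf` because `\t` inside a sed expression is not portable.

```shell
# col2_to_end_sed is a hypothetical helper name used only for this sketch.
# It moves the 2nd tab-separated column to the end of each line.
col2_to_end_sed() {
    tab=$(printf '\t')   # a literal tab character
    # \1 = column 1, \2 = column 2, \3 = the rest (each remaining column
    # with its leading tab); the line is rebuilt as \1 \3 TAB \2.
    sed "s/^\([^$tab]*\)$tab\([^$tab]*\)\(.*\)\$/\1\3$tab\2/" "$@"
}

# Made-up sample line whose first two columns contain a hyphen and a space;
# prints x-1, c, d, "a b" separated by tabs.
printf 'x-1\ta b\tc\td\n' | col2_to_end_sed
```

Lines with fewer than two columns do not match the pattern and are printed unchanged.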
BMW