
My input has a mix of tabs and spaces for readability. I want to modify a field using perl -a, then print out the line in its original form. (The data is from findup, showing me a count of duplicate files and the space they waste.) Input is:

2 * 4096    backup/photos/photo.jpg photos/photo.jpg
2 * 111276032   backup/books/book.pdf book.pdf

The output would convert field 3 to kilobytes, like this:

2 * 4 KB    backup/photos/photo.jpg photos/photo.jpg
2 * 108668 KB   backup/books/book.pdf book.pdf

In my dream world, this would be my code, since I could just will perl to automatically recombine @F and preserve the original whitespace:

perl -lanE '$F[2]=int($F[2]/1024)." KB"; print;'

In real life, joining with a single space seems like my only option:

perl -lanE '$F[2]=int($F[2]/1024)." KB"; print join(" ", @F);'

Is there any automatic variable which remembers the delimiters? If I had a magic array like that, the code would be:

perl -lanE 'BEGIN{use List::Util "reduce";} $F[2]=int($F[2]/1024)." KB"; print reduce { $a . shift(@magic) . $b } @F;'
piojo

2 Answers


No, there is no such magic variable. You can do it by hand, though:

perl -wnE'@p = split /(\s+)/; $p[4] = int($p[4]/1024) . " KB"; print @p' input.txt

The capturing parentheses in split's pattern mean that the separators are returned as well, so you keep the exact whitespace. Since the separators are now elements of the array too, the third field sits at index 4 (the fifth element).

As it turns out, -F has the same property (thanks to Сухой27), so this works too:

perl -F'(\s+)' -lanE'$F[4] = int($F[4]/1024) . " KB"; say @F' input.txt

Note: with 5.20.0 "-F now implies -a and -a implies -n". Thanks to ysth.

zdim
  • 1
    `-F'(\s+)'` can be used as parameter, and `say` will give extra newline. – mpapec Jul 23 '17 at 08:43
  • 1
    @Сухой27 Wow. Thank you. I thought of that and "recalled" that one can only use literals in there. (I think that "recollection" came from `$/` ...?) Fixed, and added. – zdim Jul 23 '17 at 09:34
  • 1
    @piojo Updated the answer -- it turns out that there is exactly what you ask. Also fixed an error. – zdim Jul 23 '17 at 09:39
  • Thanks, knowing about `-F` and `split(//)` will be really useful! – piojo Jul 23 '17 at 10:18
  • http://perldoc.perl.org/perl5200delta.html#%2A-F%2A-now-implies-%2A-a%2a-and-%2A-a%2A-implies-%2A-n%2A – ysth Jul 23 '17 at 17:44
  • @ysth Thank you, added. – zdim Jul 23 '17 at 21:42

You could just find the correct part of the line and modify it:

perl -wpE's/^\s*+(?>\S+\s+){2}\K(\S+)/int($1\/1024) . " KB"/e'
ysth
  • 4
    Escaping `\ ` in code is oh so weird, and avoidable using an alternate delimiter (e.g. `s!...!...!e` or `s{...}{...}e`) – ikegami Jul 23 '17 at 20:34