
I found this (here if you must know), and it caught my attention.

$ perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' file1 file2

I do know Perl, but I do not know how this does what it does.

$ perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' <(echo 'zz\nabc\n3535\ndef') <(echo 'abc\ndef\nff')
abc
def

It seems to print only the lines that the two input files share. I can see how storing every line as a hash key could help achieve that, but what the hell is going on with that regex?

Thinking about it some more, nothing about the use of .= is obvious either.


1 Answer

  • The expression $seen{$_} .= @ARGV appends the number of elements in @ARGV to $seen{$_}

  • While the first file is being read, @ARGV contains only one element -- the second file name

  • While the second file is being read, @ARGV is empty

  • The value of $_, which is used as the key for the %seen hash, is the latest line read from either file

  • If any given line appears only in the first file, only a 1 will be appended to the hash element

  • If any given line appears only in the second file, only a 0 will be appended to the hash element

  • If any given line appears in both files, a 1 and then a 0 will be appended to the hash element, leaving it set to 10

  • When reading through the second file, if appending the 0 leaves a value that ends in 10, the regex /10$/ matches and the line is printed

  • This results in all lines that appear in both files being printed to the output

  • OK, so this (one of the many items here) I wasn't aware of. If the program is one line, then it is used as a line filter on each arg (reading each arg in succession)? So essentially we have a doubly nested loop, all implicit. Wow. – Steven Lu Aug 13 '15 at 02:19
  • @StevenLu: The `-n` option executes the code for every line read from the input files. It's only a singly-nested loop as the files are read in sequence as if they are concatenated. Perl shifts another file name off `@ARGV` and opens it each time it needs more data. – Borodin Aug 13 '15 at 02:22
  • Oh OK, I missed the significance of the `-n` flag. It's less astonishing now. – Steven Lu Aug 13 '15 at 06:01