23

What was the design (or technical) reason for Perl not automatically localizing $_ with the following syntax:

while (<HANDLE>) {...}

Which gets rewritten as:

while (defined( $_ = <HANDLE> )) {...}

All of the other constructs that implicitly write to $_ do so in a localized manner (for/foreach, map, grep), but with while, you must explicitly localize the variable:

local $_;
while (<HANDLE>) {...}

My guess is that it has something to do with using Perl in "Super-AWK" mode with command line switches, but that might be wrong.

So if anyone knows (or better yet was involved in the language design discussion), could you share with us the reasoning behind this behavior? More specifically, why was allowing the value of $_ to persist outside of the loop deemed important, despite the bugs it can cause (which I tend to see all over the place on SO and in other Perl code)?


In case it is not clear from the above, the reason why $_ must be localized with while is shown in this example:

sub read_handle {
    while (<HANDLE>) { ... }
}

for (1 .. 10) {
    print "$_: \n"; # works, prints a number from 1 .. 10
    read_handle;
    print "done with $_\n";  # does not work, prints the last line read from
                             # HANDLE or undef if the file was finished
}
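
A self-contained version of the example above might look like the following sketch. The in-memory filehandle and the sub names `read_clobbering` and `read_localized` are made up for illustration; they stand in for `HANDLE` and `read_handle`.

```perl
use strict;
use warnings;

# Sample input read through an in-memory filehandle (an assumption for
# this sketch; any readable handle behaves the same way).
my $data = "first\nsecond\n";

sub read_clobbering {
    open my $fh, '<', \$data or die "open failed: $!";
    while (<$fh>) { }    # assigns to the global $_ without localizing it
    close $fh;
}

sub read_localized {
    open my $fh, '<', \$data or die "open failed: $!";
    local $_;            # the caller's $_ is restored when this sub returns
    while (<$fh>) { }
    close $fh;
}

my (@clobbered, @kept);
for (1 .. 3) {
    read_clobbering();
    push @clobbered, defined($_) ? $_ : 'undef';
}
for (1 .. 3) {
    read_localized();
    push @kept, defined($_) ? $_ : 'undef';
}

print "without local: @clobbered\n";   # undef undef undef
print "with local:    @kept\n";        # 1 2 3
```

Because the loop in `read_clobbering` runs to end of file, the caller sees `undef` in `$_`; it would see the last line read only if the loop exited early with `last`.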
Eric Strom
  • This is why I (almost) never trust $_. – Ted Hopp Mar 31 '11 at 19:28
  • I think you meant *{ local $_; while (<HANDLE>) { ... } }* – MkV Apr 04 '11 at 09:47
  • @MkV => the above is just a code snippet (hence the `...`, no definition of `HANDLE`, etc.). The surrounding scope is implied (be it a bare block, a subroutine's block, a control statement's block...) If you downvoted the question because of that, well, that's just sad. – Eric Strom Apr 04 '11 at 14:47
  • Workaround with desired behavior: `for (local $_=<HANDLE>;defined;$_=<HANDLE>) ...` – mob Apr 16 '12 at 16:07
  • More Perlistic: `while (defined (local $_ = <HANDLE>)) { ... }` – Borodin Apr 22 '14 at 18:26

4 Answers

14

From the thread on perlmonks.org:

There is a difference between foreach and while because they are two totally different things. foreach always assigns to a variable when looping over a list, while while normally doesn't. It's just that while (<>) is an exception and only when there's a single diamond operator there's an implicit assignment to $_.

And also:

One possible reason for why while(<>) does not implicitly localize $_ as part of its magic is that sometimes you want to access the last value of $_ outside the loop.

Eugene Yarmash
  • To the `And also`, that's the crux of this issue though, why was that behavior deemed important enough to justify the bugs it could cause. Personally I have never needed the value of `$_` to persist beyond the loop, but have been bitten by forgetting the `local $_` on occasion. – Eric Strom Mar 31 '11 at 20:16
7

Quite simply, while never localises. No variable is associated with a while construct, so it doesn't even have anything to localise.

If you change some variable in the while loop expression or in a while loop body, it's your responsibility to adequately scope it.
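
One way to scope the variable yourself, besides `local $_`, is to read into a lexical so that `$_` is never touched. This is only a sketch; `count_lines` and its input are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: counts the lines in a string via an in-memory filehandle.
sub count_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "open failed: $!";
    my $count = 0;
    while (my $line = <$fh>) {   # Perl still applies the implicit defined() test
        $count++;
    }
    close $fh;
    return $count;
}

my @report;
for (qw(a b)) {
    my $n = count_lines("one\ntwo\n");
    push @report, "$_=$n";       # $_ is still the current loop element
}
print "@report\n";               # a=2 b=2
```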

ikegami
  • No variable is associated with the `while` construct, but one certainly is associated with the `while (<HANDLE>)` construct. I would think it would be as simple as adjusting the peephole optimizer that rewrites `while (<HANDLE>)` to generate `{local $_; while (defined ($_ = <HANDLE>)) {...}}` – Eric Strom Apr 01 '11 at 15:10
  • @Eric Strom, I can't think of any problems with that solution -- it doesn't break scoping, next/last/redo or exceptions -- but I think it would require some major changes to allow the optimiser to distinguish between `while (<$fh>)` (which should localise) and `while (defined($_ = <$fh>))` (which shouldn't). – ikegami Apr 01 '11 at 18:14
2

Speculation: Because for and foreach are iterators and loop over values, while while operates on a condition. In the case of while (<FH>) the condition is that data was read from the file. The <FH> is what writes to $_, not the while. The implicit defined() test is just an affordance to prevent naive code from terminating the loop on a read of a false value.
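
The effect of the defined() test can be seen with input whose last line is "0" and has no trailing newline, which is defined but false. The sketch below uses made-up sample data and an in-memory filehandle to compare a plain truth test with the form that while (<$fh>) expands to.

```perl
use strict;
use warnings;

# Last line is "0" with no newline: a defined but false value.
my $data = "1\n0";

# Loop that tests truthiness: it stops early at the false line.
open my $fh, '<', \$data or die "open failed: $!";
my @by_truth;
while (1) {
    my $line = <$fh>;
    last unless $line;           # "0" is false, so the loop ends here
    push @by_truth, $line;
}
close $fh;

# Loop that tests definedness, as while (<$fh>) does implicitly.
open $fh, '<', \$data or die "open failed: $!";
my @by_defined;
{
    local $_;                    # keep the assignment to $_ contained
    while (<$fh>) {
        push @by_defined, $_;
    }
}
close $fh;

printf "truth test:   %d line(s)\n", scalar @by_truth;     # 1
printf "defined test: %d line(s)\n", scalar @by_defined;   # 2
```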

For other forms of while loops, e.g. while (/foo/) you wouldn't want to localize $_.

While I agree that it would be nice if while (<FH>) localized $_, it would have to be a very special case, which could cause other problems with recognizing when to trigger it and when not to, much like the rules for <EXPR> distinguishing being a handle read or a call to glob.

Michael Carman
0

As a side note, we only write while(<$fh>) because Perl doesn't have real iterators. If Perl had proper iterators, <$fh> would return one. for would use that to iterate a line at a time rather than slurping the whole file into an array. There would be no need for while(<$fh>) or the special cases associated with it.
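
The slurping difference can be checked by testing `eof` after the first iteration of each loop. This is a minimal sketch with made-up sample data; the variable names are chosen for the example.

```perl
use strict;
use warnings;

my $data = "a\nb\nc\n";

# for evaluates <$fh> in list context, so every line is read before
# the first iteration starts.
open my $fh, '<', \$data or die "open failed: $!";
my $eof_in_for;
for my $line (<$fh>) {
    $eof_in_for = eof($fh) ? 1 : 0;
    last;
}
close $fh;

# while evaluates <$fh> in scalar context, one line per iteration.
open $fh, '<', \$data or die "open failed: $!";
my $eof_in_while;
while (my $line = <$fh>) {
    $eof_in_while = eof($fh) ? 1 : 0;
    last;
}
close $fh;

print "for:   at EOF after first iteration? $eof_in_for\n";     # 1
print "while: at EOF after first iteration? $eof_in_while\n";   # 0
```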

Schwern