2

Does anyone have a solution to the task of processing a multi-line string one line at a time, other than the string-as-a-filehandle solution shown below?

use Carp qw(croak);

my $multiline_string = "line one\nline two\nline three\nline four";
my $filehandle;
open( $filehandle, '<', \$multiline_string )
    or croak("Can't open multi-line string as a filehandle: $!");
while ( defined (my $single_line = <$filehandle>) ) {
    # do some processing of $single_line here ...
}
close( $filehandle );

My reason for not wanting to use a filehandle is pretty weak. Test::Perl::Critic whines when I have more than 10 source lines between my open command and my close command on any filehandle. I'm doing quite a bit of processing of $single_line so I actually have about 40 lines of code between my open call and my close call and I don't see any way to bring that down to 10.

And I don't really want to ignore the Perl::Critic test in my build because that's actually a decent test that I'd like to pass whenever I'm opening an actual disk file in my code.

brian d foy
Kurt W. Leucht
  • If `$multiline_string` is large, the list returned by `split` will be even larger and defeat the line-by-line processing of `$multiline_string`. Either use a regular expression to match lines one at a time, or factor out the work you do to a subroutine. I personally would prefer the latter. – Sinan Ünür Oct 09 '09 at 15:59
  • wow. how silly of me not to think of the subroutine workaround. sometimes I just don't think things through. :-) – Kurt W. Leucht Oct 09 '09 at 16:16

8 Answers

10

Make the Perl Critic happy, and make yourself even happier, by creating a subroutine, and calling it with each line of the file.

use strict; use warnings;

my $multiline_string = "line one\nline two\nline three\nline four";

sub do_something {
    my ($line) = @_;
    # do something with $line
}

open my $fh, '<', \$multiline_string
    or die "Cannot open scalar for reading: $!";

while(<$fh>) {
    chomp;
    do_something($_);
}

close $fh; 
Sinan Ünür
Jonathan Feinberg
  • This is definitely the right way to do it. However, you should **always** check if open succeeded and use the 3-arg form of open with lexical filehandles. – Sinan Ünür Oct 09 '09 at 16:00
  • Or at least **always** use 3-arg open. You can "use autodie" (along with strict and warnings) if you don't want to bother with checking whether your opens succeed. – Dave Sherohman Oct 09 '09 at 19:21
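
For reference, the autodie suggestion in the comment above might look like the sketch below. It assumes the sample string from the question, and `do_something` here just uppercases the line as a placeholder for real processing. With `autodie` in effect, a failing `open` or `close` throws an exception, so no explicit `or die` is needed.

```perl
use strict;
use warnings;
use autodie;    # open and close now throw an exception on failure

# Sample data taken from the question.
my $multiline_string = "line one\nline two\nline three\nline four";

# Placeholder processing; the real per-line work would go here.
sub do_something {
    my ($line) = @_;
    return uc $line;
}

open my $fh, '<', \$multiline_string;

my @results;
while ( my $line = <$fh> ) {
    chomp $line;
    push @results, do_something($line);
}

close $fh;

print "$_\n" for @results;
```

The loop body stays at three lines no matter how much work `do_something` grows to contain, which is what keeps the open and close calls close together.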
5

Um, isn't the purpose of the whine to get you to have smaller blocks of code that do just one thing? Make a subroutine that does what's needed for each line.

Many people have suggested split /\n/, but split /^/ is more like the filehandle way, because it leaves the trailing newline on each line.
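
A small sketch of the difference, using the sample string from the question (the array names are made up for illustration). `split /^/` is treated by Perl as `split /^/m`, so it breaks the string at the start of every line and keeps each newline, just as reading from a filehandle would.

```perl
use strict;
use warnings;

my $multiline_string = "line one\nline two\nline three\nline four";

# Splitting on the newline itself discards it.
my @without_newlines = split /\n/, $multiline_string;

# Splitting at line starts keeps the newline on every line that had one.
my @with_newlines = split /^/, $multiline_string;

for my $line (@with_newlines) {
    my $had_newline = ( $line =~ /\n\z/ ) ? 'yes' : 'no';
    chomp( my $text = $line );
    print "'$text' (newline kept: $had_newline)\n";
}
```

With `split /^/` you would normally `chomp` each line yourself, exactly as you would after `<$fh>`.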

Sinan Ünür
ysth
3

What about:

my $multiline_string = "line one\nline two\nline three\nline four";
my @lines = split(/\n/,$multiline_string);
foreach my $line (@lines) {
    #do stuff with string
}
Tom Jefferys
3

I might be missing something, but could you do:

my @lines = split(/\n/,$multiline_string);
foreach my $single_line (@lines) {
  ...
}
Salgar
  • Don't forget you can process a multiline string with regexps using the /m or /s option, as described in perldoc perlre -- this may be easier than splitting on \n, depending on what you're searching for. – Ether Oct 09 '09 at 19:49
3

Long before I even knew you could shoehorn a multiline string into a filehandle, there was split:

foreach my $single_line (split /\n/, $multiline_string) {
    # process $single_line here
    # although note that it doesn't end in a newline anymore
}

Insert disclaimer about using literal and non-portable \n here.
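
If the input might carry mixed line endings, one possible way around the literal `\n` is the `\R` escape, which matches any common line ending. This is only a sketch: it assumes Perl 5.10 or later, and the `$mixed` string is invented for illustration.

```perl
use strict;
use warnings;
use 5.010;    # \R needs Perl 5.10 or later

# Hypothetical input mixing Windows (\r\n), Unix (\n) and old Mac (\r) endings.
my $mixed = "line one\r\nline two\nline three\rline four";

# \R treats \r\n as a single line ending, so no stray \r is left behind.
my @lines = split /\R/, $mixed;

print "'$_'\n" for @lines;
```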

mob
2

Perl::Critic is nice, but when you start obsessing about some of its arbitrary requirements, it starts to waste your time rather than save it. I just let the filehandle go out of scope and don't worry about the close:

 my $multiline_string = "line one\nline two\nline three\nline four";

 {
     open( my $fh, '<', \$multiline_string )
         or croak("Can't open multi-line string as a filehandle: $!");
     while ( defined (my $single_line = <$fh>) ) {
         # do some processing of $single_line here ...
     }
 }

A lot of people reach for regexes or split, but I think that's sloppy. You don't need to create a new list and use up a lot more memory in your program.
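
The scope trick can also be combined with the subroutine idea from the other answers. In the sketch below, `each_line` is a hypothetical helper (not from any module) that opens the string, hands each line to a callback, and lets the lexical filehandle close itself when the sub returns.

```perl
use strict;
use warnings;

# each_line is a hypothetical helper name used only for this illustration.
sub each_line {
    my ( $string_ref, $callback ) = @_;

    open( my $fh, '<', $string_ref )
        or die "Can't open string as a filehandle: $!";

    while ( defined( my $line = <$fh> ) ) {
        $callback->($line);
    }

    return;    # $fh goes out of scope here, which closes it implicitly
}

my $multiline_string = "line one\nline two\nline three\nline four";

my @seen;
each_line(
    \$multiline_string,
    sub {
        my ($line) = @_;
        chomp $line;
        push @seen, $line;
    }
);

print scalar(@seen), " lines processed\n";
```

Only one line is held at a time inside `each_line`; the `@seen` array exists here just to show that every line reached the callback.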

Sinan Ünür
brian d foy
0

You could use a regex.

#!/usr/bin/perl

use strict;
use warnings;

my $s = "line one\nline two\nline three\nline four";

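# /c keeps pos() from being reset when the final match attempt fails
while ($s =~ m'^(.*)$'gmc) {
    print "'$1'\n";
}

# this check assumes $s does not end with a trailing newline
die "Exited loop too early\n" unless pos $s == length $s;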

Or you could use split:

for my $line ( split m'\n', $multiline_string ){

  # ...

}
Brad Gilbert
  • The regular expression approach is best IMHO. You do not need `\G` and `/m`. Use: `while ( $s =~ /(.+?)\n/g ) {`. `split` is wasteful because it would mean keeping two copies of essentially the same data in memory. – Sinan Ünür Oct 09 '09 at 15:53
  • *, not + there, or you'd skip empty lines. And ? is useless. \n belongs in the capture to be more like the filehandle read way. – ysth Oct 09 '09 at 16:01
  • And while \G may be unneeded, I'd keep it; when you expect to consume all string piecemeal, it's best to enforce it (with m/\G.../gc and a pos() check after the loop) so you don't accidentally miswrite your regex and lose some of the data (like your + instead of *). – ysth Oct 09 '09 at 16:03
  • @ysth Note that the OP's string does not end with a `\n`. To process that string correctly `+` would be needed and `\n` would have to be optional. – Sinan Ünür Oct 09 '09 at 16:11
  • @Sinan Ünür: then you'd need /\G(?:.*\n|.+)/gc (or some variant; many ways to do it). But I wouldn't be surprised if the real data had a newline at the end. – ysth Oct 09 '09 at 18:13
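
Putting the comments above together, a `\G` loop with `/gc` and a `pos()` check afterwards could look roughly like this. The sample string is a variation on the one in the answer, with an empty line added to show that it is not skipped; each captured line keeps its newline, as a filehandle read would.

```perl
use strict;
use warnings;
use 5.010;    # for the // (defined-or) operator

# Sample string with an empty third line and no trailing newline.
my $s = "line one\nline two\n\nline four";

my @lines;

# \G anchors each match where the previous one ended; /c keeps pos()
# intact when the last attempt fails, so it can be checked afterwards.
while ( $s =~ /\G(.*\n|.+)/gc ) {
    push @lines, $1;
}

# If the regex had been miswritten, some input would be left unconsumed.
die "Unconsumed input at offset " . ( pos($s) // 0 ) . "\n"
    unless ( pos($s) // 0 ) == length $s;

print scalar(@lines), " lines matched\n";
```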
-1

Personally I like using $/ to separate the lines in a multiline string.

my $multiline_string = "line one\nline two\nline three\nline four";
foreach (split($/, $multiline_string)) {
  process_line($_);
}
sub process_line {
  my $line = shift;
  ...
}
dlamblin