In Perl, how do I read filenames in a directory into a Perl array only 10 at a time? All directory reading solutions I have found seem only to read the entire list of files into an array at once, but I don't want to read all files at once because there may be millions and I want to keep the memory footprint down.

zdim
skeetastax

2 Answers


The perldoc page for readdir shows that it returns a single directory entry when called in scalar context:

readdir DIRHANDLE

Returns the next directory entry for a directory opened by opendir. If used in list context, returns all the rest of the entries in the directory. If there are no more entries, returns the undefined value in scalar context and the empty list in list context.

Sample code from the same page:

opendir(my $dh, $some_dir) || die "Can't open $some_dir: $!";
while (readdir $dh) {
    print "$some_dir/$_\n";
}
closedir $dh;

Note that this is sample code. In a real script you would probably be better served by explicitly declaring a variable for the entry returned by readdir rather than relying on the $_ special variable.
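A minimal sketch of that suggestion, with an explicitly declared lexical variable in place of $_. The each_entry helper and the temporary demo directory are assumptions made for illustration and are not part of the perldoc sample. The loop also skips the . and .. entries, which readdir returns along with the regular files.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: call $callback once per directory entry,
# holding only one entry name in memory at a time.
sub each_entry {
    my ($dir, $callback) = @_;
    opendir(my $dh, $dir) or die "Can't open $dir: $!";
    # Scalar-context readdir returns one entry per call, undef at the end.
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' || $entry eq '..';   # skip self and parent
        $callback->($entry);
    }
    closedir $dh;
    return;
}

# Demo directory with three empty files (illustration only).
my $some_dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.txt)) {
    open(my $fh, '>', "$some_dir/$name") or die "Can't create $name: $!";
    close($fh);
}

my @seen;
each_entry($some_dir, sub { push @seen, $_[0] });
print "$some_dir/$_\n" for sort @seen;
```

The defined() test matters: without it, a file literally named 0 would end a hand-written loop early.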

Dave Cross
Bruno Ramos

opendir opens the directory, and readdir in scalar context then reads it one entry at a time. After that, it's just a question of buffering the results until you have the right number.

my $dir_qfn = "...";

my @buf;
opendir(my $dh, $dir_qfn)
   or die("Can't open directory \"$dir_qfn\": $!\n");

while (defined( my $fn = readdir($dh) )) {
   my $qfn = "$dir_qfn/$fn";
   push @buf, $qfn;
   process_files(splice(@buf)) if @buf == 10;
}

process_files(@buf) if @buf;

process_files(splice(@buf)); is just short for process_files(@buf); @buf = ();
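A small standalone check of that equivalence; the sample values are made up for illustration:

```perl
use strict;
use warnings;

my @buf = qw(a b c);

# splice with only the array argument removes and returns every element,
# so the buffer is handed off and emptied in a single expression.
my @batch = splice(@buf);

print "batch: @batch\n";
print "left in buf: ", scalar(@buf), "\n";
```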


That's the straightforward way of writing it, but what if you wanted to eliminate the duplicated sub call?

my $dir_qfn = "...";

my @buf;
opendir(my $dh, $dir_qfn)
   or die("Can't open directory \"$dir_qfn\": $!\n");

while (1) {
   my $fn = readdir($dh);
   if (defined($fn)) {
      my $qfn = "$dir_qfn/$fn";
      push @buf, $qfn;
   }

   if (!defined($fn) || @buf == 10) {
      process_files(splice(@buf));
   }

   last if !defined($fn);
}

This allows you to inline process_files. For example, if you had

sub process_files {
   print("$_\n") for @_;
}

you can now replace

if (!defined($fn) || @buf == 10) {
   process_files(splice(@buf));
}

with

if (!defined($fn) || @buf == 10) {
   print("$_\n") for splice(@buf);
}
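If the batching is needed in more than one place, the same buffering could also be wrapped in a closure that hands back one batch per call. This is only a sketch under stated assumptions: make_batch_reader is a hypothetical helper name, the temporary directory exists purely for the demo, and skipping . and .. is an addition that the code above does not make.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: returns a closure that yields up to $size paths
# per call, and an empty list once the directory is exhausted.
sub make_batch_reader {
    my ($dir_qfn, $size) = @_;
    opendir(my $dh, $dir_qfn)
        or die("Can't open directory \"$dir_qfn\": $!\n");

    return sub {
        return if !$dh;                 # already exhausted and closed
        my @batch;
        while (@batch < $size) {
            my $fn = readdir($dh);
            if (!defined($fn)) {        # no more entries
                closedir($dh);
                undef $dh;
                last;
            }
            next if $fn eq '.' || $fn eq '..';
            push @batch, "$dir_qfn/$fn";
        }
        return @batch;
    };
}

# Demo: 25 empty files in a temporary directory (illustration only).
my $dir = tempdir(CLEANUP => 1);
for my $i (1 .. 25) {
    open(my $fh, '>', "$dir/file$i") or die("Can't create file$i: $!\n");
    close($fh);
}

my $next_batch = make_batch_reader($dir, 10);
my @sizes;
while (my @batch = $next_batch->()) {
    push @sizes, scalar(@batch);        # process the batch here
}
print "batch sizes: @sizes\n";          # prints "batch sizes: 10 10 5"
```

At most $size paths are held in memory at once, and the caller's loop ends naturally when the closure returns an empty list.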
ikegami