5

I'm writing a program to step through a directory tree (Yes, I know about File::Find, but I'm writing a replacement).

In my program, I'm doing a readdir on a whole directory and placing it in a list. I need to do two things:

  1. Remove . and .. from the list
  2. Prepend the current directory name on each file.

I can do this with a loop, or I can use map and grep:

# Map and Grep

my @dir_stack = readdir $dir_fh;;
@dir_stack = grep { !/^\.{1,2}$/ } @dir_stack;
@dir_stack = reverse map { "$cwd/$_" } @dir_stack;
push @stack, @dir_stack;

# Read Loop

opendir $dir_fh, $cwd;
my @dir_stack;
foreach my $file (readdir $dir_fh) {
    next if $file =~ /^\.{1,2}$/;   #Skip "." and ".."
    unshift @dir_stack, "$cwd/$file";
}
push @stack, @dir_stack;

What about combining grep and map?

 opendir $dir_fh, $cwd;
 my @dir_stack = readdir $dir_fh;;
 @dir_stack = grep { !/^\.{1,2}$/ && {$_ = "$cwd/$_"} } @dir_stack;
 push @stack, reverse @dir_stack;

I want my code to be readable next week when I look at it and try to figure out what's going on. I also need my code to be efficient.

David W.
  • 105,218
  • 39
  • 216
  • 337

3 Answers3

7

Modifying $_ in grep? yuck! And what's with using an anon hash constructor?

@dir_stack = grep { !/^\.{1,2}$/ && {$_ = "$cwd/$_"} } @dir_stack;

should be

@dir_stack = map { /^\.\.?\z/ ? () : "$cwd/$_" } @dir_stack;

But I personally find using both map and grep more readable than combining them.

push @stack,
   reverse
    map "$cwd/$_",
     grep !/^\.\.?\z/,
      readdir $dh;

The need for reverse is rather odd, and it's much more visible here than hiding as a unshift, so that's another bonus.

ikegami
  • 367,544
  • 15
  • 269
  • 518
  • +1: If not combining `map` and `grep` leads to performance problems, the directory is way, way too big. – Tim Dec 12 '11 at 19:56
  • I combined `map` and `grep` in a single `grep` for performance reasons. I've noticed that `map` and `grep` aren't that much more efficient than a loop. If I used them both, wouldn't a loop would be more efficient? Reverse/unshift isn't needed, but I want to keep the files in order as they are with `find`. Does combining all of these commands into a single super command more efficient than doing it line-by-line? Can they operate in parallel, or doe `grep` have to complete and return a list before `map` can do anything. If it's the latter, I'd prefer to list each line. – David W. Dec 12 '11 at 21:04
  • @David W., `readdir` does return the same order as `find`. You're reversing it twice: Once using `reverse` and then once using a stack. To preserve order, drop the `reverse`, and use a *queue* instead of a stack (push+shift instead of push+pop). – ikegami Dec 13 '11 at 01:06
  • @David W., To paraphrase your question, "does removing three array assignments make it more efficient than doing it line-by-line", and the answer is yes. Not only is it more readable, not doing three assignments is faster than doing three assignments. – ikegami Dec 13 '11 at 01:09
  • 1
    @David W., What makes you think than combining into that grep is faster than combining into that map? – ikegami Dec 13 '11 at 01:13
2

To make your code more readable, you just need to include one more line:

# exclude '.' and '..', and prepend dir name to each elem in @dir_stack

:-)

mob
  • 117,087
  • 18
  • 149
  • 283
  • 4
    Comments shouldn't repeat the code, but state the goal. "# Full path names of directory contents." is a *much* better comment. – ikegami Dec 12 '11 at 19:55
1

Sounds like you may want glob instead. Though I believe it will exclude all files beginning with . (i.e. hidden files), not just . and ... And of course, you can't have spaces in the path.

my @stack = glob "$dir_fh/*";

It will return as long a paths as you feed it.

TLP
  • 66,756
  • 10
  • 92
  • 149
  • I'm running v5.10 and `glob "*"` includes both `.`, `..` and hidden files and directories. – flesk Dec 12 '11 at 20:46
  • @TLP : The `$]` in the codepad paste is confusing. What does `$^V` give you? – Zaid Dec 13 '11 at 04:23
  • @Zaid `version ` which is not a character recognized by my browser. Not sure why that happens, but it's on codepad's end. So I used `$]` instead. – TLP Dec 13 '11 at 05:04