So I have a program that I want to use to clean some text files. The program asks the user to enter the full path of a directory containing these text files. From there I want to read the files in the directory, print them to a new file (specified by the user), and then clean them in the way I need. I have already written the script to clean the text files.

I ask the user for the directory to use:

chomp ($user_supplied_directory = <STDIN>); 
opendir (DIR, $user_supplied_directory);

Then I need to read the directory.

my @dir = readdir DIR;

foreach (@dir) {

Now I am lost.

Any help please?

CanSpice
AlphaA
  • how does the user specify the new file (especially given there will be multiple new files) – ysth Nov 26 '10 at 00:17

4 Answers

I'm not certain what you want, so I made some assumptions:

  • When you say clean the text file, you meant delete the text file
  • The names of the files you want to write into are formed by a pattern.

So, if I'm right, try something like this:

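chomp(my $user_supplied_directory = <STDIN>);

opendir(my $dh, $user_supplied_directory)
    or die "Can't open $user_supplied_directory: $!";
my @dir = readdir $dh;
closedir $dh;

foreach my $name (@dir) {
    next if (($name eq '.') || ($name eq '..'));

    # readdir returns bare names, so prepend the directory
    my $path = "$user_supplied_directory/$name";
    next unless -f $path;

    # Reads the whole content of the original file
    open(my $in, '<', $path) or die "Can't read $path: $!";
    my $contents = do { local $/; <$in> };
    close $in;

    # Here you supply the new filename
    my $new_filename = $path . ".new";

    # Writes the content to the new file
    open(my $out, '>', $new_filename) or die "Can't write $new_filename: $!";
    print {$out} $contents;
    close $out or die "Can't finish writing $new_filename: $!";

    # Deletes the old file
    unlink $path or warn "Can't delete $path: $!";
}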
Doug
  • You should check the results of your `opendir` and `open` calls, or otherwise `use autodie;` – friedo Nov 26 '10 at 02:33
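The `use autodie;` suggestion in the comment above can be sketched as follows. This is only an illustration, not the answerer's code: the helper name `slurp_file` and the missing path used in the demonstration are made up for the example.

```perl
use strict;
use warnings;
use autodie;    # open, close, opendir, closedir now die with a message on failure

# Hypothetical helper: return the whole contents of one file as a string.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path);    # no "or die" needed under autodie
    local $/;                    # undefined record separator: read everything at once
    my $contents = <$fh>;
    close($fh);
    return $contents;
}

# A missing file now raises an exception instead of failing silently.
# The path below is a placeholder that is assumed not to exist.
my $ok = eval { slurp_file("/no/such/file/here.txt"); 1 };
print "open failed and was caught\n" unless $ok;
```

Without `autodie`, the same effect needs an explicit `or die "...: $!"` after each `open` and `opendir`.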
I would suggest that you switch to File::Find. It can be a bit of a challenge in the beginning but it is powerful and cross-platform.

But, to answer your question, try something like:

my @files = readdir DIR;
foreach my $file (@files) {
   foo("$user_supplied_directory/$file");
}

where "foo" is whatever you need to do to the files. A few notes might help:

  • using "@dir" as the array of files was a bit misleading
  • the folder name needs to be prepended to the file name to get the right file
  • it might be convenient to use grep to throw out unwanted files and subfolders, especially ".."
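The last two notes can be combined in a short sketch: prepend the folder name, then let `grep` keep only plain files. The helper name `plain_files_in` is hypothetical, and defaulting to the current directory is an assumption made only so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: full paths of the plain files directly inside $dir.
sub plain_files_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open $dir: $!";
    # Prepend the folder name first, then keep only plain files;
    # "." and ".." are directories, so the -f test rejects them as well.
    my @paths = grep { -f $_ } map { "$dir/$_" } readdir $dh;
    closedir $dh;
    return @paths;
}

# Use the first command-line argument, or the current directory if none is given.
my $dir = @ARGV ? $ARGV[0] : '.';
print "$_\n" for plain_files_in($dir);
```

Filtering on `-f` after building the full path matters: testing the bare name would look in the current working directory rather than in the folder being read.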
igelkott
  • File::Find? nothing in the question makes me think nested directories should be searched. – ysth Nov 26 '10 at 02:44
  • Using *“cross-platform”* to mean *“even works on Microsoft”* is a weasel-worded euphemism smacking of political correctness gone mad. – tchrist Nov 26 '10 at 19:43
  • @tchrist I like code that works across platforms without much extra effort on my part. It's just convenient, not politically correct. In this case, I couldn't tell which OS was being used ... and frankly didn't really care. – igelkott Nov 28 '10 at 00:28
  • @ysth I like using File::Find in general but maybe it's too much bother here. Especially if it's unfamiliar. – igelkott Nov 28 '10 at 00:41
I wrote something today that used readdir. Maybe you can learn something from it. This is just a part of a (somewhat) larger program:

our @Perls = ();

{
    my $perl_rx = qr { ^ perl [\d.] + $ }x;
    for my $dir (split(/:/, $ENV{PATH})) {
        ### scanning: $dir
        my $absolute = ($dir =~ m{^/});
        my $dirpath = $absolute ? $dir : "$cwd/$dir";
        unless (chdir($dirpath)) {
            warn "can't cd to $dirpath: $!\n";
            next;
        }
        opendir(my $dot, ".") || next;
        while ($_ = readdir($dot)) {
            next unless /$perl_rx/o;
            ### considering: $_
            next unless -f;
            next unless -x _;
            ### saving: $_
            push @Perls, "$dir/$_";
        }
    }
}

{
    my $two_dots = qr{ [.] .* [.] }x;
    if (grep /$two_dots/, @Perls) {
        @Perls = grep /$two_dots/, @Perls;
    }
}

{
    my (%seen, $dev, $ino);
    @Perls = grep {
        ($dev, $ino) = stat $_;
        ! $seen{$dev, $ino}++;
    } @Perls;
}

The crux is push(@Perls, "$dir/$_"): filenames read by readdir are basenames only; they are not full pathnames.
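To make the basename point concrete, here is a small self-contained sketch. It builds a scratch directory with `File::Temp` (an assumption made purely for the demonstration, as is the file name `sample.txt`) and compares a file test on the bare name with one on the full path.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory holding one file, so the sketch runs on its own.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/sample.txt") or die "Can't create sample file: $!";
close($fh);

opendir(my $dh, $dir) or die "Can't open $dir: $!";
my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir $dh;

for my $name (@names) {
    # $name is only "sample.txt"; a file test on it looks in the current
    # working directory, not in $dir.
    my $bare = (-f $name)        ? "found" : "not found";
    my $full = (-f "$dir/$name") ? "found" : "not found";
    print "$name: bare name $bare, full path $full\n";
}
```

Unless the current working directory happens to contain a file with the same name, only the full-path test should succeed.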

tchrist
  • Thanks for the replies, guys. I think with these posts I can get the job done. To clarify what I want to do: I am a PhD biology student and I work with Fasta files. They have a pretty simple format, but when you copy and paste files from the internet you get more line breaks than you want. I wrote a script that will open a file, remove excess white space and line breaks, and then save the file to a user-supplied location. – AlphaA Nov 26 '10 at 01:36
  • I now want to add a function to the script so that it will open a directory full of fasta files, open all the fasta files and copy them to one file (in a location supplied by the users). and then do the cleaning. Thanks – AlphaA Nov 26 '10 at 01:37
  • @user520742, You might find `glob("$dir/*")` easier to use than `readdir`. Did you [try Googling for Perl and Fasta](http://tinyurl.com/2agpbv4)? I notice one of the first links is [over on Perlmonks](http://www.perlmonks.org/?node_id=833644). Looks like there’s a [Bio::SeqIO::fasta](http://search.cpan.org/search?query=Bio::SeqIO::fasta) that’s part of Bio Perl. One more tip: If you edit your user profile, you can have a real name. – tchrist Nov 26 '10 at 01:50
  • Thanks for the link to the website. I did google, but as part of my PhD I am hoping to do some bioinformatics, so I am trying to write all these things on my own. I have plenty of books on this, but I was hoping that seeing the solution for my exact problem would help me understand opendir/readdir/etc. better. Thanks for the advice on my name as well. – AlphaA Nov 26 '10 at 01:53
  • chomp( my $directory = <STDIN> ); opendir( DIR, "$directory" ) || die("Oh, no! I can't open the directory; I just don't have the power!"); #Read file names in directory to array my @dir = readdir DIR; #Copy contents of each file in directory to new file. foreach my $dir (@dir) { next if ( ( $dir eq '.' ) || ( $dir eq '..' ) ); open Dir_Files, "$directory\$dir"; print MERGED "$directory\$dir\n"; close Dir_Files; } closedir DIR; – AlphaA Nov 26 '10 at 20:37
  • Well, there is the pertinent code. By the way I want this to work on linux/OS X and Windows OS. My problem is I am not getting it to print the contents of each file in directory (DIR) to the new file (MERGED). – AlphaA Nov 26 '10 at 20:39
  • @user520742: The bug is you’re using the **cursèd slackbash** as the path separator, but it means to escape things!! Use proper slashes no matter *where* you are and you’ll get a whole lot less grief. Note you can uncompile a Perl program using `perl -MO=Deparse,-q,-p,-x9 somescript` to see what it is really doing. – tchrist Nov 26 '10 at 21:09
  • @user520742: I’m a little easy on bio folks doing computation biology; I work in the department of Computational Pharmacology at the University of Colorado, and I see a lot of people like you in my work. – tchrist Nov 26 '10 at 21:11
  • Using the forward slash gave me the same error. I will try using your perl -MO=Deparse, -q, -p, -x9. It should be interesting. This is the output I get when trying to merge the files. The new file contains the paths c:\Test/test1.txt c:\Test/test2.txt c:\Test/test3.txt c:\Test/test4.txt c:\Test/test5.txt c:\Test/test6.txt c:\Test/test7.txt, each on its own line. I am really just starting Perl, have no real programming background, and no one around to really help me. Thanks – AlphaA Nov 27 '10 at 16:46
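One likely reason the merged file ends up holding only path names is that the loop in the comment above prints the path string and never reads from the handle it opened. The following is a minimal sketch of the merge, assuming the goal is to append every plain file in one directory to a single output file. The helper name `merge_directory` and both of its arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: append the contents of every plain file in $dir
# to the file at $merged_path.
sub merge_directory {
    my ($dir, $merged_path) = @_;

    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    # Forward slashes work as path separators on Linux, OS X, and Windows.
    # The output file is skipped in case it lives in the same directory.
    my @paths = grep { -f $_ && $_ ne $merged_path }
                map  { "$dir/$_" } readdir $dh;
    closedir $dh;

    open(my $out, '>', $merged_path) or die "Can't write $merged_path: $!";
    for my $path (sort @paths) {
        open(my $in, '<', $path) or die "Can't read $path: $!";
        # Print the lines read from the handle, not the path string.
        while (my $line = <$in>) {
            print {$out} $line;
        }
        close($in);
    }
    close($out) or die "Can't finish writing $merged_path: $!";
    return;
}

# Example call (both paths are placeholders):
# merge_directory("c:/Test", "c:/merged.txt");
```

The cleaning step can then run once on the merged file. `glob("$dir/*")`, as suggested in an earlier comment, would return the paths with the directory already attached.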
You can do the following, which lets the user supply a directory and falls back to a designated default location if none is given.

The example uses opendir and readdir to store every entry of the directory in the @files array, then copies only the names that end with '.txt' into the @keys array. The while loop prepends the directory, so full paths are stored in the arrays.

This assumes that your "text files" end with the ".txt" suffix. I hope that helps, as I'm not quite sure what's meant by "cleaning the files".

use strict;
use warnings;
use feature ':5.24';
use File::Copy;

my $dir = shift || "/some/default/directory";

opendir(my $dh, $dir) || die "Can't open $dir: $!";

# readdir returns bare names, so prepend the directory to each one
my @files;
while ( readdir $dh ) {
    push( @files, "$dir/$_");
}
closedir $dh;

# store ".txt" files in new array
my @keys;
foreach my $file ( @files ) {
    push( @keys, $file ) if $file =~ /\.txt\z/;
}

# Move files to new location, even if it's across different devices
for ( @keys ) {
    move( $_, "/some/other/directory/" ) || die "Couldn't move $_: $!\n";
}

See the perldoc of File::Copy for more info.

ILMostro_7