8

A file is passed to my Perl script as an argument. The file can be a .txt file or a .zip file containing the .txt file.

I want to write code that looks something like this:

if ($file is a zip) {

    unzip $file
    $file =~ s/zip$/txt/;
}

One way to check the extension is to split the name on . and then match the last element of the array returned by split.

Is there some better way?

Lazer
  • Are you sure you only want to check the extension? If you are hoping to test what type of file you are dealing with you would be better off checking the mime-type. Take a look at something like this: http://search.cpan.org/~pmison/File-Type-0.22/lib/File/Type.pm – totels Oct 15 '10 at 08:30
  • Chiming in with support for @totels and a couple of the lower rep answers. I am surprised at how many think relying on the extension is either safe (`mv virus.exe hooters.jpg`) or robust (`mv some-huge-dossy-garbage.bin whatever.zip`). Assuming zip and catching errors or exploring the MIME type are the right answers given. Any solution using the extension is a mistake. – Ashley Aug 09 '17 at 19:48

8 Answers

14

You can use File::Basename for this.

#!/usr/bin/perl

use 5.010;
use strict;
use warnings;

use File::Basename;

my @exts = qw(.txt .zip);

while (my $file = <DATA>) {
  chomp $file;
  my ($name, $dir, $ext) = fileparse($file, @exts);

  # $ext is the empty string when the suffix is not listed in @exts
  if ($ext eq '.txt') {
    say "$file is a text file";
  }
  elsif ($ext eq '.zip') {
    say "$file is a zip file";
  }
  else {
    say "$file is an unknown file type";
  }
}

__DATA__
file.txt
file.zip
file.pl

Running this gives:

$ ./files 
file.txt is a text file
file.zip is a zip file
file.pl is an unknown file type
Dave Cross
12

Another solution is to use File::Type, which determines a file's type from its contents rather than from its name.

use strict;
use warnings;

use File::Type;

my $file      = '/path/to/file.ext';
my $ft        = File::Type->new();
my $file_type = $ft->mime_type($file);

if ( $file_type eq 'application/octet-stream' ) {
    # possibly a text file
}
elsif ( $file_type eq 'application/zip' ) {
    # file is a zip archive
}

This way, you do not have to deal with missing/wrong extensions.
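
If installing File::Type is not an option, the same content-based idea can be sketched with core Perl only. Zip archives normally begin with the local file header signature `PK\x03\x04`, so reading the first four bytes gives a rough check. The helper name `looks_like_zip` is hypothetical, and this minimal sketch assumes an ordinary, non-empty archive (an empty zip starts with a different signature).

```perl
use strict;
use warnings;

# Hypothetical helper: true if the file starts with the zip local file header
# signature "PK\x03\x04". Assumes an ordinary, non-empty zip archive.
sub looks_like_zip {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $read = read($fh, my $magic, 4);   # number of bytes actually read
    close $fh;
    return defined($read) && $read == 4 && $magic eq "PK\x03\x04";
}
```

Calling `looks_like_zip($file)` before unzipping ignores the extension entirely, at the cost of only recognising this one signature.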

Alan Haggai Alavi
9

How about checking the end of the filename?

if ($file =~ /\.zip$/i) {

and then:

use strict;
use Archive::Extract;

if ($file =~ /\.zip$/i) {
    my $ae = Archive::Extract->new(archive => $file);
    my $ok = $ae->extract();
    my $files = $ae->files();
}

more information here.

eumiro
4

You can check the file extension using a regex match as:

if($file =~ /\.zip$/i) {
        # $file is a zip file 
}
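
To connect this back to the question, the match and the rename can be folded into one substitution. This is only a sketch, and `txt_name_for` is a hypothetical helper name. Anchoring on the literal dot matters: the question's `s/zip$/txt/` would also rewrite a name such as `unzip`.

```perl
use strict;
use warnings;

# Hypothetical helper: if $file ends in .zip (any case), return the name of
# the .txt file it is expected to contain; otherwise return $file unchanged.
sub txt_name_for {
    my ($file) = @_;
    (my $txt = $file) =~ s/\.zip\z/.txt/i;   # \z anchors at the very end
    return $txt;
}

print txt_name_for('report.ZIP'), "\n";   # report.txt
print txt_name_for('notes.txt'),  "\n";   # notes.txt
print txt_name_for('unzip'),      "\n";   # unzip (no dot, so left alone)
```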
codaddict
3

I know this question is several years old, but for anyone that comes here in the future, an easy way to break apart a file path into its constituent path, filename, basename and extension is as follows.

use File::Basename;

my $filepath = '/foo/bar.txt';

my ($basename, $parentdir, $extension) = fileparse($filepath, qr/\.[^.]*$/);
my $filename = $basename . $extension;

You can test its results with the following.

my @test_paths = (
    '/foo/bar/fish.wibble',
    '/foo/bar/fish.',
    '/foo/bar/fish.asdf.d',
    '/foo/bar/fish.wibble.',
    '/fish.wibble',
    'fish.wibble',
);

foreach my $this_path (@test_paths) {
    print "Current path: $this_path\n";
    my ($this_basename, $parentdir, $extension) = fileparse($this_path, qr/\.[^.]*$/);
    my $this_filename = $this_basename . $extension;

    foreach my $var (qw/$parentdir $this_filename $this_basename $extension/) {
        print "$var = '" . eval($var) . "'\n";
    }

    print "\n\n";
}
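
For the original question only the extension matters, so the same fileparse call can be wrapped in a small function. This is a sketch rather than part of the answer above; `extension_of` is a hypothetical name, and lower-casing the result is an assumption made so that `.ZIP` and `.zip` compare equal.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: lower-cased extension without the leading dot,
# or an empty string when the name has no extension.
sub extension_of {
    my ($path) = @_;
    my (undef, undef, $ext) = fileparse($path, qr/\.[^.]*$/);
    $ext =~ s/^\.//;    # drop the leading dot, if any
    return lc $ext;
}

for my $path ('/foo/bar/fish.wibble', '/foo/bar/fish.asdf.d', 'archive.ZIP', 'README') {
    printf "%-22s -> '%s'\n", $path, extension_of($path);
}
```

With this pattern only the final dotted part counts as the extension, so `/foo/bar/fish.asdf.d` yields `d`.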

Hope this helps.

Luke G
2

Why rely on file extension? Just try to unzip and use appropriate exception handling:

eval {
    # try to unzip the file
};

if ($@) {
    # not a zip file
}
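
One way to fill in that skeleton, as a sketch only, uses the core module IO::Uncompress::Unzip. The helper name `slurp_text_or_zip` is hypothetical, and the sketch assumes the text file is the first member of the archive. `Transparent => 0` is needed because, by default, the module passes non-zip input through unchanged instead of failing.

```perl
use strict;
use warnings;
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Hypothetical helper: return the text held in $path, whether $path is a
# plain text file or a zip archive whose first member is the text file.
sub slurp_text_or_zip {
    my ($path) = @_;
    my $text;

    my $ok = eval {
        # Transparent => 0 makes unzip fail on non-zip input instead of
        # copying the data through as-is.
        unzip($path => \$text, Transparent => 0)
            or die "unzip failed: $UnzipError\n";
        1;
    };
    return $text if $ok;

    # Not a zip archive: read the file as plain text.
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # slurp mode
    $text = <$fh>;
    close $fh;
    return $text;
}
```

The caller never inspects the file name, so a zip archive with a missing or wrong extension is still handled.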
Eugene Yarmash
0

Maybe a little bit late but it could be used as an alternative reference:

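sub unzip_all {
     my $director = shift;
     opendir my $DIRH, $director or die "Cannot open $director: $!";
     my @files = readdir $DIRH;
     closedir $DIRH;
     foreach my $file (@files){
              next if $file eq '.' or $file eq '..';
              # Ask file(1) what the entry contains, regardless of its name
              my $type = `file "$director/$file"`;
              if ($type =~ m/gzip compressed data/){
                      system 'gunzip', "$director/$file";
              }
      }
      return;
}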

This runs the Linux file command from Perl using backticks (``). You pass in the path of a directory, and every entry that file classifies as gzip compressed data is decompressed with gunzip.

0

If you do not mind using a Perl module, you can use Module::Generic::File, for example:

use Module::Generic::File qw( file );
my $f = file( '/some/where/file.zip' );
if( $f->extension eq 'zip' )
{
    # do something
}

Module::Generic::File has a lot of features to handle and manipulate a file.

Jacques