I am trying to clean up a rather large file system. I use the stat function to obtain the modification time of each file.

According to perldoc -f stat, the tenth element of the returned list is the last modified time in seconds since the epoch.
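
As a minimal sketch of that lookup (not the code from the module in question): the tenth element is index 9 of the list the built-in `stat` returns, and the core `File::stat` module exposes the same value through an `mtime` method. The path `'.'` is only an illustrative choice so the example always has something to stat.

```perl
use strict;
use warnings;
use File::stat ();    # load the module without overriding the built-in stat

my $path = '.';       # illustrative path: the current directory always exists

# Built-in stat returns a list; element 9 (the tenth) is the mtime in epoch seconds.
my $mtime_list = ( stat $path )[9];

# File::stat's stat returns an object with named accessors instead.
my $mtime_obj = File::stat::stat($path)->mtime;

print "list: $mtime_list, object: $mtime_obj\n";
```

Both forms read the same field, so either can feed `DateTime->from_epoch`.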

I use DateTime->from_epoch to turn that into a DateTime object and subtract it from DateTime->now to calculate the age of the file:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use DateTime;

    my $now = DateTime->now();
    #my $now = DateTime->now( time_zone => "America/New_York" );

    $self->{dir} = '/tmp/test';
    opendir(DIR, $self->{dir}) or die $@;
    my @files = grep(/\.txt$/, readdir(DIR));
    closedir(DIR);

    for ( @files ) {

            my $file = stat($self->{dir} . '/' . $_);
            my $mtime = DateTime->from_epoch(epoch => $file->mtime);
            #my $mtime = DateTime->from_epoch(epoch => $file->mtime, time_zone=> "America/New_York");
            my $elapsed = $now - $mtime;
            push(@{$self->{stale}}, {file => $self->{dir} . '/' . $_, mtime => $elapsed->in_units('minutes')}) if $elapsed->in_units('minutes') > 15;
            push(@{$self->{remove}}, {file => $self->{dir} . '/' . $_, mtime => $elapsed->in_units('days')}) if $elapsed->in_units('days') > 10;
    }

If I manually create test files and change their modification time, the result is off by 30 days:

$ touch /tmp/test/test{100..104}.txt -d '-45 days'
$ perl MTIME.pm 
$VAR1 = {
          'mtime' => 15,
          'file' => '/tmp/test/test100.txt'
        }; $VAR1 = {
          'mtime' => 15,
          'file' => '/tmp/test/test104.txt'
        }; $VAR1 = {
          'mtime' => 15,
          'file' => '/tmp/test/test103.txt'
        }; $VAR1 = {
          'mtime' => 15,
          'file' => '/tmp/test/test101.txt'
        }; $VAR1 = {
          'mtime' => 15,
          'file' => '/tmp/test/test102.txt'
        };

I've tried DateTime objects both with and without the time zone set with no difference in results.

$ touch /tmp/test/test{100..104}.txt -d '-45 days'
$ touch /tmp/test/test{105..110}.txt
$ ll /tmp/test
total 11
-rw-r--r-- 1 root root    0 Apr  3 19:31 test100.txt
-rw-r--r-- 1 root root    0 Apr  3 19:31 test101.txt
-rw-r--r-- 1 root root    0 Apr  3 19:31 test102.txt
-rw-r--r-- 1 root root    0 Apr  3 19:31 test103.txt
-rw-r--r-- 1 root root    0 Apr  3 19:31 test104.txt
-rw-r--r-- 1 root root    0 May 18 19:30 test105.txt
-rw-r--r-- 1 root root    0 May 18 19:30 test106.txt
-rw-r--r-- 1 root root    0 May 18 19:30 test107.txt
-rw-r--r-- 1 root root    0 May 18 19:30 test108.txt
-rw-r--r-- 1 root root    0 May 18 19:30 test109.txt
-rw-r--r-- 1 root root    0 May 18 19:30 test110.txt

Working solution:

#!/usr/bin/perl

use strict;
use warnings 'all';

use Data::Dumper;

my $self = bless { }, 'My::Class';

my @files = glob '/tmp/test/*.txt';

for (@files) {
        my $days = int(-M $_);
        my $mins = int((time - (stat $_)[9]) / 60);
        my $item = {
                file  => $_,
                days => $days,
                minutes => $mins
        };
        push @{ $self->{remove} }, $item if $days > 10;
        push @{ $self->{stale} },  $item if $mins > 15;
}

print Dumper $self;
Mose
  • Have you considered `-M $file`? It's the difference between the script start time (so, "now") and the last modification time, which seems to be what you need. See [file tests (-X)](http://perldoc.perl.org/functions/-X.html). – zdim May 18 '17 at 23:43
  • Please review what you posted. The `$self->{dir} ...` shouldn't compile under `strict` (and makes no sense as it stands). – zdim May 18 '17 at 23:46
  • You will find that to get the full duration, just calling in_units('days') is not good enough. Check out the DateTime::Duration docs; you may be missing months or even years without realizing it. Working directly with epoch numbers and seconds may be easier here. – bytepusher May 19 '17 at 00:05
  • The code you posted is missing substantial chunks that would make it a complete, self-contained example. As it stands, it is gibberish. However, an approximately 30-day delta seems to indicate to me that you are missing the fact that months are zero based in Perl or something like that. That's the first thing I would look for. – Sinan Ünür May 19 '17 at 00:09
  • If you create two DateTime objects a year apart, subtract them to get a duration, and call in_units('days') on that, you get ... 0. That is due to DateTime maths and is mentioned in the docs. – bytepusher May 19 '17 at 00:18
  • Note that `-M` defines days as `24*60*60` seconds, so `-M $file >= 10` doesn't quite check if a file is at least 10 days old, but it's probably close enough for the OP's purpose. – ikegami May 19 '17 at 01:28

2 Answers

Aside from any other possible issues:

my $elapsed = $now - $mtime;
push(@{$self->{remove}}, {
    file => $self->{dir} . '/' . $_, 
    mtime => $elapsed->in_units('days')
}) 
if $elapsed->in_units('days') > 10;

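does not do what you expect. Subtracting two DateTime objects yields a DateTime::Duration that stores months and days separately, and in_units('days') converts only the days (and weeks) portion, ignoring the months. A file modified 45 days ago can therefore produce a duration of one month and 15 days, which reports 15.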
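
A small sketch of the effect, using fixed dates chosen to match the listing in the question (April 3 and May 18, 2017; the year and the UTC time zone are assumptions made for the example). It also shows `delta_days`, a DateTime method that returns a duration measured in days only, as one possible way to get the full count.

```perl
use strict;
use warnings;
use DateTime;

# Two fixed instants exactly 45 days apart (illustrative values).
my $mtime = DateTime->new(
    year => 2017, month => 4, day => 3,
    hour => 19, minute => 31, time_zone => 'UTC',
);
my $now = DateTime->new(
    year => 2017, month => 5, day => 18,
    hour => 19, minute => 31, time_zone => 'UTC',
);

# Overloaded subtraction keeps months and days in separate buckets.
my $mixed = $now - $mtime;
my ( $months, $days ) = $mixed->in_units( 'months', 'days' );

# delta_days measures the gap in days only, with no month component.
my $whole_days = $now->delta_days($mtime)->in_units('days');

print "months=$months days=$days whole_days=$whole_days\n";
# prints "months=1 days=15 whole_days=45"
```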

You could create a DateTime::Duration object of 10 days and compare to that. To do so, you would need a base datetime.

For example

my $base = DateTime->now;
my $ten_days = DateTime::Duration->new( days => 10 );

if ( DateTime::Duration->compare($elapsed,$ten_days,$base) == 1 ){
    push(@{$self->{remove}}, {
        file => $self->{dir} . '/' . $_, 
        mtime => $elapsed->in_units('days')
    }) 
}

I would suggest, though, to simply calculate the seconds in 10 days and see if the time elapsed is bigger than that, as it seems much easier, and stat returns epoch anyway.
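
A minimal sketch of that epoch-based check, assuming a day is treated as 86,400 seconds (so it is not calendar-exact). `age_in_seconds` is a hypothetical helper introduced only for this example, and the directory is the one used in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: seconds between $now and the file's mtime,
# or undef if the file cannot be stat'ed.
sub age_in_seconds {
    my ( $path, $now ) = @_;
    my @st = stat $path or return;    # empty list on failure
    return $now - $st[9];             # element 9 is mtime in epoch seconds
}

my $limit = 10 * 24 * 60 * 60;        # ten days of 86,400 seconds each
my $now   = time;

for my $file ( glob '/tmp/test/*.txt' ) {
    my $age = age_in_seconds( $file, $now );
    next unless defined $age;
    print "$file is older than 10 days\n" if $age > $limit;
}
```

Dividing the same age by 60 gives minutes, so one value can drive both the 15-minute and the 10-day thresholds.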

bytepusher
  • Or use `-M`, which does the math for you and takes its input in days. – Matt Jacob May 19 '17 at 00:48
  • Re "*I would suggest, though, to simply calculate the seconds in 10 days*": that's actually very complicated. `DateTime::Duration->compare($elapsed,$ten_days,$base) == 1` is indeed the simple way of doing that. What the OP could do, however, is to compare against `time-10*24*60*60`. It might not be exactly 10 days ago, but it's probably close enough for the OP. – ikegami May 19 '17 at 01:17

Your question is hard to understand because you are writing an object-oriented module and then running it as a program. The code you show won't compile, mainly because $self is never declared or defined. If you hope for useful answers, please post a complete program that we can run and that demonstrates the problem you are asking about.

I can't try your program and see the problem for myself, but there are two obvious improvements to make:

  • It is much easier to make a call to glob than to open and read a directory, remove the files that you don't want, and rebuild the path to each file by adding back the directory

  • You can use the built-in -M operator to discover the age of a file in floating-point days

I've written this, which creates an empty object in class My::Class and adds data to it. It includes the ideas I've talked about but, like your own code, produces no output. Hopefully you will understand how to adapt this to your own structure.

The only problem you may have is that the mtime fields are floating-point days. You may want to apply int or POSIX::ceil, depending on what those values are used for.
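
To illustrate the difference between the two roundings, here is a short sketch with a made-up age of 10.25 days (a value chosen for the example, not taken from a real file):

```perl
use strict;
use warnings;
use POSIX qw(ceil);

my $age = 10.25;                  # illustrative value, as -M might return

my $truncated  = int($age);       # whole days already completed
my $rounded_up = ceil($age);      # days counting the partial one

print "$truncated $rounded_up\n"; # prints "10 11"
```

`int` suits "at least N full days old" checks, while `ceil` fits reports that should count a partly elapsed day.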

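use strict;
use warnings 'all';

my $self = bless { }, 'My::Class';

# glob returns full paths, so there is no need to rebuild them
my @files = glob '/tmp/test/*.txt';

for my $file ( @files ) {

    # Age in floating-point days, relative to the script start time
    my $age = -M $file;

    if ( $age >= 10.0 ) {

        my $item = {
            file  => $file,
            mtime => $age,
        };

        push @{ $self->{remove} }, $item;

        # Thresholds here are both in days
        push @{ $self->{stale} },  $item if $age >= 15.0;
    }
}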
Borodin
  • Apologies for the lack of completeness in the example. The piece I posted here is one method in a module that contains several, and posting the whole thing would have been both bloated and time consuming to sanitize. For debugging and testing I temporarily have the module doing __PACKAGE__->new->test to review output. I'll try your example tomorrow and see how it works; I hadn't considered glob. I'm not exactly a Perl pro, and most of my scripting doesn't deal with filesystems. Given the sensitive nature of the data, I want to ensure I'm approaching this the best way possible. – Mose May 19 '17 at 02:40
  • Also, I had initially started with -M. However, $self->{stale} stores age in minutes, whereas $self->{remove} stores age in days. -M performs much better than DateTime, but I wasn't getting the results I expected when I tried it (dates and times are a stumbling point for me), so I thought DateTime might be the safer option, albeit not the ideal one. – Mose May 19 '17 at 02:50