I've got multiple access logs in the logs directory, following the naming convention below:

access.log.1284642120
access.log.1284687600
access.log.1284843260

Basically, Apache "rotates" the logs once a day and appends the rotation time (a Unix timestamp) to each file name, so the files can be sorted into chronological order.

I am trying to read them one after another, so that they can be treated as a single log file.

my @logs = glob('logs/access.log.*');

The above code will glob all the logs, but I am not sure about two things:

  • In what order will the logs be returned? Alphabetically?
  • If I want to find "the latest access time for each unique IP", how could I do this?

I have a Perl script that can read a single access log and check this easily (my algorithm is to keep one big hash that uses the IP address as the key and the access time as the value, and to keep updating it as each line is read). But I don't want to merge all the access logs into one temporary file just for this.

Any suggestions? Many thanks in advance.

Michael Mao

2 Answers

If you want to ensure a particular order, sort it yourself, even if just to assure yourself that it will come out right:

 my @files = sort { ... } glob( ... );

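In this case, where the filenames are identical except for the timestamp and every timestamp has the same number of digits, the default string sort already gives chronological order, so you might not need the sort block: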

 my @files = sort glob( ... );
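As a rough sketch of what a sort block could look like for these names, the following compares the numeric suffix rather than the whole string. The helper name `log_epoch` and the sample file names are illustrative only, and it assumes the suffix is always a run of digits at the end of the name:

```perl
use strict;
use warnings;

# Hypothetical helper: pull the trailing timestamp out of a name such as
# "logs/access.log.1284642120"; returns 0 if the name has no numeric suffix.
sub log_epoch {
    my ($name) = @_;
    return $name =~ /\.(\d+)\z/ ? $1 : 0;
}

# Sample names standing in for the result of glob('logs/access.log.*').
my @unsorted = qw(
    logs/access.log.1284843260
    logs/access.log.1284642120
    logs/access.log.1284687600
);

# Compare numerically, so the order stays chronological even if the
# timestamps ever differ in length.
my @files = sort { log_epoch($a) <=> log_epoch($b) } @unsorted;

print "$_\n" for @files;
```

With equal-length timestamps this gives the same order as a plain `sort`; the numeric comparison only matters if the number of digits ever changes.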

To read them as one über-file, I like to use a local @ARGV so I can use the diamond operator, which is really just the magic ARGV filehandle. When it gets to the end of one file in @ARGV, it moves on to the next. This fakes specifying all the files on the command line by assigning to @ARGV inside the program:

 {
     local @ARGV = sort { ... } glob( ... );

     while ( <> ) {
         ...;
     }
 }

If you need to know the file you are currently processing, look in $ARGV.
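Applied to the question's "latest access time per IP" check, the local @ARGV approach might look like the sketch below. The function name `latest_access_by_ip` is made up for illustration, and the regex assumes Common Log Format lines (client address first, timestamp in square brackets); adjust it if your LogFormat differs. It also assumes the files are passed in chronological order, so a later line can simply overwrite an earlier one.

```perl
use strict;
use warnings;

# Returns a hash ref mapping each IP to the timestamp of its last request.
# Assumes Common Log Format lines, e.g.
#   10.0.0.1 - - [18/Sep/2010:12:09:01 +0000] "GET / HTTP/1.1" 200 512
# and that the files are given in chronological order.
sub latest_access_by_ip {
    my @files = @_;
    my %latest;
    return \%latest unless @files;    # with an empty @ARGV, <> would read STDIN

    local @ARGV = @files;
    while ( my $line = <> ) {
        # First field is the client address, the bracketed field is the time.
        next unless $line =~ /^(\S+) \S+ \S+ \[([^\]]+)\]/;
        $latest{$1} = $2;             # later lines overwrite earlier ones
    }
    return \%latest;
}

my @logs   = sort glob('logs/access.log.*');
my $latest = latest_access_by_ip(@logs);
printf "%-15s %s\n", $_, $latest->{$_} for sort keys %$latest;
```

Because the hash is filled while the files are streamed one after another, no temporary merged file is needed.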

If you need something more fancy, you might have to resort to brute force.

brian d foy
  • +1 for punctuation. Like metal bands, SO answers are better with umlauts. – FMc Sep 18 '10 at 12:09
  • You also get the magic of `$.` keeping track of the current line number of the current file. – mob Sep 18 '10 at 18:19

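In a Unix-y environment, you can let cat group your files together and read its output through a pipe:

# Sort explicitly so the files are concatenated in chronological order.
my @files = sort glob("$dir/access.log.*");
# List form of open: cat is run directly, so file names with spaces or
# shell metacharacters are passed through untouched.
open(my $one_big_logfile, "-|", "cat", @files) or die ...;
while (<$one_big_logfile>) {
   ...
}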
mob