I have a problem with the following Perl program, which reorganizes the trace of accesses to an application.

I implemented the following solution with a "jump rows" (row-skipping) function because in the future I could have 10 or more rotated files of 50 MB each.

I want to skip the lines already read in a previous run (if the inode of the file has not changed), so that I only process the deltas.

I hope this code can help other users.

#!/usr/bin/perl

use strict;
use warnings 'all';

use File::Path qw<mkpath>;
use File::Spec;
use File::Copy;
use POSIX qw<strftime>;
use English;

# Dynamic Variables
my %older_count;
my %older_inode;
my @newer_filelist;
my @events;

my $OLD_IN_FILE = "";

# Static Variables
# Directories
my $IN_DIR               = "/tmp/appo/log";    # Input Directories
my $OUTPUT_LOG_DIRECTORY = "/tmp/appo/A14";    # Output directory

# Files
my $SPLITTED_OUTFILE = "parse_log.csv";           # Splitted by month output file
my $R_STATS          = ".rotation_statistics";    # Rotation Statistic file

## MAIN

# Loading old statistics
if ( -e $R_STATS ) {
    open( my $stat_fh, '<', $R_STATS ) or die "can't open '$R_STATS': $OS_ERROR";

    while ( my $line = <$stat_fh> ) {
        chomp $line;

        # Each line has the form filename;inode;count
        my ( $file, $inode, $nrows ) = split /;/, $line, 3;
        next unless defined $nrows;    # skip malformed lines

        push @{ $older_count{$file} }, $nrows;
        push @{ $older_inode{$file} }, $inode;
    }

    close( $stat_fh );
}

# Loading new events from log
foreach my $INPUT ( glob( "$IN_DIR/logrotate_*.log" ) ) {

    my $inode        = ( stat( $INPUT ) )[1];
    my $currentinode = $older_inode{$INPUT}[0];

    my $jumprow = 0;
    # Only skip rows if this file was seen before and its inode is unchanged
    $jumprow = $older_count{$INPUT}[0] if defined $currentinode && $currentinode == $inode;

    # Get current file statistics
    if ( $INPUT ne $OLD_IN_FILE ) {
        my $count = ( split /\s+/, `wc -l $INPUT` )[0];
        push @newer_filelist, {
            filename => $INPUT,
            inode    => $inode,
            count    => $count
        };
    }

    # Log opening
    open my $fh, '<', $INPUT or die "can't read open '$INPUT': $OS_ERROR";

    $/ = "\n\n";    # record separator

    while ( <$fh> ) {

        # next unless $. > $jumprow; # This instruction doesn't work

        # Log processing
        my @lines = split /\n/;
        my $i     = 0;

        foreach my $lines ( @lines ) {

            # Take only Authentication rows and skip others
            if ( $lines[$i] =~ m/\A#\d.\d.+#\d{4}\s\d{2}\s\d{2}\s\d{2}:\d{2}:\d{2}:\d{3}#\+\d+#\w+#\/\w+\/\w+\/Authentication/ ) {

                # Keep only Login/Logout actions and exclude Guest users
                # (one alternation, so the Guest check applies to both actions)
                if ( $lines[ $i + 2 ] =~ m/Login|Logout/ && $lines[ $i + 3 ] !~ m/Guest/ ) {

                    my ( $y, $m, $d, $time ) = $lines[$i] =~ /\A#\d.\d.+#(\d{4})\s(\d{2})\s(\d{2})\s(\d{2}:\d{2}:\d{2}:\d{3})/;

                    my ( $action ) = $lines[ $i + 2 ] =~ /(\w+)/;
                    my ( $user )   = $lines[ $i + 3 ] =~ /\w+:\s(.+)/;

                    push @events, {
                        date   => "$y/$m/$d",
                        time   => $time,
                        action => $action,
                        user   => $user
                    };  # Array loader
                }
            }
            else {
                next;
            }

            $i++;
        }

        $OLD_IN_FILE = $INPUT;
    }
    close( $fh );
}

# Print log statistics for further processing
open( STAT_FILE, '>', $R_STATS ) or die $!;

foreach my $my_filelist ( @newer_filelist ) {
    print STAT_FILE join ';', $my_filelist->{filename}, $my_filelist->{inode}, "$my_filelist->{count}\n";
}

close( STAT_FILE );

my @by_user = sort { $a->{user} cmp $b->{user} } @events;    # Sorting by users

foreach my $my_list ( @by_user ) {

    my ( $y, $m ) = $my_list->{date} =~ /(\d{4})\/(\d{2})/;

    # Generate directory MM-YYYY, e.g. 05-2018
    my $directory = File::Spec->catdir( $OUTPUT_LOG_DIRECTORY, "$m-$y" );

    unless ( -e $directory ) {
        mkpath( $directory, { verbose => 1 } );
    }

    my $log_file_path = File::Spec->catfile( $directory, $SPLITTED_OUTFILE );

    open( OUTPUT, '>>', $log_file_path ) or die $!;
    print OUTPUT join ';', $my_list->{date}, $my_list->{time}, $my_list->{action}, "$my_list->{user}\n";
    close( OUTPUT );    # close inside the loop, since the file is reopened for every event
}

My log files are

logrotate_1.0.log

#2.0^H#2018 05 29 10:09:45:969#+0200#Info#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103EC9E50000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER1#5##C47731E44D00000bae##0#Thread[HTTP Worker [@1473726842],5,Dedicated_Application_Thread]#Plain##
Login
User: USER4
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 29 11:51:06:541#+0200#Info#/Sy/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103EC9F50000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER4#6##A40B81404D03c0bae##0#Thread[HTTP Worker [@1264376989],5,Dedicated_Application_Thread]#Plain##
Login
User: USER1
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 30 11:54:03:906#+0200#Info#/Sy/Sec/Informtion#
#BC-JAS-SEC#security#C0000A7103EC9F50000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER4#6##A40B81404D03c0bae##0#Thread[HTTP Worker [@1264376989],5,Dedicated_Application_Thread]#Plain##
Login
User: USER4
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 30 11:59:59:156#+0200#Info#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA0C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER3#7##9ACF7Ec0bae##0#Thread[HTTP Worker [@124054179],5,Dedicated_Application_Thread]#Plain##
Logout
User: USER3
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 30 08:32:11:348#+0200#Warn#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA20E0000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#03c0bae##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Login
User: USER4
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 30 11:09:54:978#+0200#Info#/Sys/Sec/Information#
#BC-JAS-SEC#security#C0000A7103ECA20E0000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#03c0bae##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Login
User: USER2
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 06 01 08:11:30:008#+0200#Warn#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA20050000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#0##E0E##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Logout
User: USER2
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 06 01 11:11:29:658#+0200#Info#/Sys/Sec/Information#
#BC-JAS-SEC#security#C0000A7103ECA20050000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#0##E0E##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Logout
User: USER1
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 06 02 12:00:00:254#+0200#Info#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA20050000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#0##E0E##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Logout
User: Guest
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 06 02 12:05:00:465#+0200#Warn#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA20050000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#0##E0E##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Logout
User: USER9
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 06 02 12:50:00:065#+0200#Warn#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA20050000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#0##E0E##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Login
User: USER9
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 24 10:43:38:683#+0200#Info#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103EC9E50000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER1#5##C47731E44D00000bae##0#Thread[HTTP Worker [@1473726842],5,Dedicated_Application_Thread]#Plain##
Login
User: USER1
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

logrotate_0.0.log

#2.0^H#2018 05 24 11:05:04:011#+0200#Info#/Sy/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103EC9F50000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER4#6##A40B81404D03c0bae##0#Thread[HTTP Worker [@1264376989],5,Dedicated_Application_Thread]#Plain##
Login
User: USER4
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 24 11:04:59:410#+0200#Info#/Sy/Sec/Informtion#
#BC-JAS-SEC#security#C0000A7103EC9F50000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER4#6##A40B81404D03c0bae##0#Thread[HTTP Worker [@1264376989],5,Dedicated_Application_Thread]#Plain##
Login
User: USER4
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 24 11:05:07:100#+0200#Info#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA0C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER3#7##9ACF7Ec0bae##0#Thread[HTTP Worker [@124054179],5,Dedicated_Application_Thread]#Plain##
Logout
User: USER3
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 24 11:07:21:314#+0200#Warn#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA20E0000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#03c0bae##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Login
User: USER2
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 24 11:07:21:314#+0200#Info#/Sys/Sec/Information#
#BC-JAS-SEC#security#C0000A7103ECA20E0000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#03c0bae##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Login
User: USER2
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 26 10:48:02:458#+0200#Warn#/Sys/Sec/Authentication#
#BC-JAS-SEC#security#C0000A7103ECA20050000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#0##E0E##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Logout
User: USER2
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

#2.0^H#2018 05 28 10:00:25:000#+0200#Info#/Sys/Sec/Information#
#BC-JAS-SEC#security#C0000A7103ECA20050000508C#3935150000000004#common.com/irj#com.common.services.security.authentication.logincontext.table#USER2#0##E0E##0#Thread[HTTP Worker [@2033389552],5,Dedicated_Application_Thread]#Plain##
Logout
User: USER0
IP Address: 127.0.0.1
Authentication Stack: ticket
Authentication Stack Properties:

I have a problem with this statement, which is commented out inside the read loop:

#next unless $. > $jumprow;

I think it doesn't work because of the record separator I use, but I don't understand what kind of separator I have to use to solve this problem:

$/ = "\n\n";  # record separator

To debug the code I inserted the following statement:

print "next unless $. > $jumprow\n";

As I can see, the value of $. is not the same as the line number in the file (the cause is the double-newline record separator, $/ = "\n\n";).

If I remove the double-newline separator, the script doesn't work.
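As a minimal sketch of the mismatch described above (not part of the original script): with $/ = "\n\n", $. counts blank-line-separated records, while the count saved by `wc -l` is a number of lines, so the two cannot be compared directly. The sample text and the `count_records` helper below are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample: two blank-line-separated records of three lines each
# (eight newline characters in total, which is what `wc -l` would report).
my $sample = "a1\na2\na3\n\nb1\nb2\nb3\n\n";

# Returns a list of [ record_number, lines_in_record ] pairs.
sub count_records {
    my ($text) = @_;

    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    local $/ = "\n\n";    # same record separator as in the script

    my @seen;
    while ( my $record = <$fh> ) {
        my @lines = split /\n/, $record;
        push @seen, [ $., scalar @lines ];    # $. is the record number here
    }
    close $fh;

    return @seen;
}

printf "record %d has %d lines\n", @$_ for count_records($sample);
```

Under that assumption, one possible fix is to save the last value of $. (a record count) in the statistics file instead of the `wc -l` result, so that `next unless $. > $jumprow;` compares like with like.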

Details of my script:

(1) First step: I read STAT_FILE to see how many rows were read in the last run.

(2) Second step: I collect the date, time, action (Login or Logout) and user (if it isn't Guest) into an array (@events). I sort the array by user (not by date, which is the default order).

(3) Third step: I print information about the log files I have read into STAT_FILE.

(4) Fourth step: I write the sorted @events array into a file named parse_log.csv in a directory named MM-YYYY (depending on the date of the event).

Could you please help me find a solution for my script?

clarkseth

1 Answer

I thought we covered this yesterday.

if ( $currentinode == $inode ) {
    # Get rows to jump for this $INPUT
    my $jumprow = $older_count{$INPUT}[0];
}
else {
    # If file has been changed
    my $jumprow = 0;
}

Each of these blocks declares a new $jumprow variable. And each of those variables ceases to exist when you exit the block that they were declared in (i.e. on the very next line).

If you want to access these variables outside of the if/else blocks, then you need to declare them at a higher level.

my $jumprow;
if ( $currentinode == $inode ) {
    # Get rows to jump for this $INPUT
    $jumprow = $older_count{$INPUT}[0];
}
else {
    # If file has been changed
    $jumprow = 0;
}

Or (more simply):

my $jumprow = 0;
$jumprow = $older_count{$INPUT}[0] if $currentinode == $inode;

Or

my $jumprow = $currentinode == $inode ? $older_count{$INPUT}[0] : 0;
Dave Cross
  • I'm being particularly dense. Can you explain the question for me please Dave? – Borodin Aug 01 '18 at 15:54
  • @Dave Cross You're absolutely right, I had not thought about it. I immediately adopted your penultimate solution; it is really clean and smart – clarkseth Aug 01 '18 at 15:57
  • @Borodin: As I understand it, he's trying to avoid processing bits of the logfiles multiple times, so he's holding a counter recording the highest record number that was seen previously and skipping to that record on the next run. I think I've fixed his immediate problem, but there are higher level architectural problems here that I don't have time for. – Dave Cross Aug 01 '18 at 16:01
  • @Dave: Ah I see, I think. Thank you very much. This sounds like a case for `seek`: just `seek` to the length of the file when it was last processed. – Borodin Aug 01 '18 at 17:34
  • @DaveCross Do you have any idea why my script doesn't work if I remove the record separator ($/ = "\n\n";)? It no longer captures the contents of the lines. – clarkseth Aug 02 '18 at 15:02
  • @clarkseth: That sounds like it should be a separate question, to be honest. – Dave Cross Aug 02 '18 at 15:08
  • @DaveCross ...my problem has always been the separator :( (with jumprow). Your suggestion made my jumprow logic smarter. I have been working on my script to perfect it, and I have kept the code updated until now. That's all... If you would like me to simplify my problem into a short question, I can delete this question and create another one. Tell me what I have to do; you're the only user who has helped me with this question so far. – clarkseth Aug 02 '18 at 15:31
  • @clarkseth: Please don't consider deleting this question. Stack Overflow works best if questions and answers are left for other users to find. I can't simplify your problem into a new question for you, as I don't really understand it myself. You'll have to do that. – Dave Cross Aug 02 '18 at 15:52
  • @clarkseth: It's also worth pointing out that the very act of sitting down and trying to work out exactly what you want to ask, is often a good way to discover that you already know the answer :-) – Dave Cross Aug 02 '18 at 15:53
  • @DaveCross Ah OK, I had not understood that you didn't understand. I'll post a new question. Thank you – clarkseth Aug 02 '18 at 15:54
  • @DaveCross ahahah yeah, usually after a good question, we find a correct answer alone :) – clarkseth Aug 02 '18 at 15:55
  • @DaveCross I have posted a new question :) as you suggested https://stackoverflow.com/questions/51669156/perl-encapsulate-data-from-file-with-regex-exprex-doesnt-work-without – clarkseth Aug 03 '18 at 09:11
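Borodin's `seek` suggestion in the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the accepted fix: `read_from` is a hypothetical helper, the temporary file stands in for a rotated log, and the script would have to store a byte offset (from `tell`) in its statistics file instead of a row count.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: read all records from $path starting at byte $offset.
# Returns the records and the offset to store for the next run.
sub read_from {
    my ( $path, $offset ) = @_;

    open my $fh, '<', $path or die "can't open '$path': $!";
    seek $fh, $offset, SEEK_SET or die "can't seek in '$path': $!";

    local $/ = "\n\n";          # blank-line-separated records, as in the script
    my @records    = <$fh>;     # everything after the saved offset
    my $new_offset = tell $fh;  # end of the data read in this run
    close $fh;

    return ( \@records, $new_offset );
}

# Demo with a temporary file standing in for a rotated log.
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
print {$tmp} "r1 line1\nr1 line2\n\n";
close $tmp;

my ( $first, $offset ) = read_from( $path, 0 );          # first run: from the start

open my $append, '>>', $path or die "can't append to '$path': $!";
print {$append} "r2 line1\nr2 line2\n\n";                # new data arrives
close $append;

my ( $second, $end ) = read_from( $path, $offset );      # second run: delta only

printf "%d record(s) on first run, %d on second\n", scalar @$first, scalar @$second;
```

As with the row counter, the saved offset should be reset to 0 whenever the inode of the file has changed. A record that is only partially written at the time of a run would need extra handling that this sketch does not attempt.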