I'm not sure how to phrase this question, so I don't know how to search for it on Google or SO. Let me just show you the data. By the way, this is just an Awk exercise, it's not homework. I've been trying to solve this off and on for 2 days now. Below is an example:

Mon Sep 15 12:17:46 1997
User-Name = "wynng"
NAS-Identifier = 207.238.228.11
NAS-Port = 20104
Acct-Status-Type = Start
Acct-Delay-Time = 0
Acct-Session-Id = "239736724"
Acct-Authentic = RADIUS
Client-Port-DNIS = "3571800"
Framed-Protocol = PPP
Framed-Address = 207.238.228.57

Mon Sep 15 12:19:40 1997
User-Name = "wynng"
NAS-Identifier = 207.238.228.11
NAS-Port = 20104
Acct-Status-Type = Stop
Acct-Delay-Time = 0
Acct-Session-Id = "239736724"
Acct-Authentic = RADIUS
Acct-Session-Time = 115
Acct-Input-Octets = 3915
Acct-Output-Octets = 3315
Acct-Input-Packets = 83
Acct-Output-Packets = 66
Ascend-Disconnect-Cause = 45
Ascend-Connect-Progress = 60
Ascend-Data-Rate = 28800
Ascend-PreSession-Time = 40
Ascend-Pre-Input-Octets = 395
Ascend-Pre-Output-Octets = 347
Ascend-Pre-Input-Packets = 10
Ascend-Pre-Output-Packets = 11
Ascend-First-Dest = 207.238.228.255
Client-Port-DNIS = "3571800"
Framed-Protocol = PPP
Framed-Address = 207.238.228.57

So the log file contains the above data for various users. I specifically pasted this to show that this user had a login, Acct-Status-Type = Start, and a logoff, Acct-Status-Type = Stop. This counts as one session. Thus I need to generate the following output.

User:           "wynng"
Number of Sessions: 1
Total Connect Time: 115
Input Bandwidth Usage:  83
Output Bandwidth Usage: 66

The problem I have is keeping the info attached to the user. Every Stop entry in the log file has the same fields, so I can't just match with a regex like

/Acct-Input-Packets/{inPackets =$3}

/Acct-Output-Packets/{outPackets = $3}

Each new record would overwrite the values from the previous one. What I want is: when I find a User-Name entry whose record has a Stop, record the input/output packet values for that user. This is where I get stumped.

For the session count, I was thinking of saving the User-Names in an array and then, in END{}, counting how many times each name occurs and dividing that count by two (flooring the result when the count is odd), since each session produces one Start and one Stop entry.

I don't necessarily want the answer, but maybe some hints/guidance or perhaps a simple example that I could expand on.

1 Answer

You can check each line for:

  • a date pattern: /\w+\s\w+\s[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}\s[0-9]{4}/
  • a user name value: /User-Name\s+=\s+\"\w+\"/
  • a status value: /Acct-Status-Type\s+=\s+\w+/
  • an input packet value: /Acct-Input-Packets\s+=\s[0-9]+/
  • an output packet value: /Acct-Output-Packets\s+=\s[0-9]+/
  • an empty line: /^$/

Once you have defined what you are looking for (the patterns above), it's just a matter of testing each line against them and storing the matched values in arrays.
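As a rough illustration of these checks, the sketch below tags each input line with the kind of line it matches. It is a portable approximation, not the script from this answer: the function name classify_lines is made up for the example, and the gawk-only \w and \s are replaced with bracket expressions and literal spaces so that it should also run on other awks.

```shell
# classify_lines is a hypothetical helper name used only for this sketch.
classify_lines() {
  awk '
    /^$/                                { print "blank";  next }
    /^ *User-Name *= *"[A-Za-z0-9_]+"/  { print "user";   next }
    /^ *Acct-Status-Type *= *[A-Za-z]+/ { print "status"; next }
    /^ *Acct-Input-Packets *= *[0-9]+/  { print "input";  next }
    /^ *Acct-Output-Packets *= *[0-9]+/ { print "output"; next }
    # date line such as: Mon Sep 15 12:19:40 1997
    /^[A-Z][a-z][a-z] [A-Z][a-z][a-z] +[0-9]+ [0-9]+:[0-9]+:[0-9]+ [0-9]+$/ { print "date"; next }
                                        { print "other" }
  ' "$@"
}

printf '%s\n' 'Mon Sep 15 12:19:40 1997' 'User-Name = "wynng"' \
  'Acct-Status-Type = Stop' 'Acct-Input-Packets = 83' '' | classify_lines
```

Each rule ends with next so that a line is reported once, which mirrors the if / else if chain used in the full script.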

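In the following example, each kind of value above is stored in its own array, indexed by a counter that is incremented whenever an empty line /^$/ is detected, so all the values from one log entry share the same index. Note that the script relies on gawk extensions: the three-argument match(), mktime(), and the \w and \s regex shorthands.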

awk 'BEGIN{
    count = 1;
    i = 1;
}{
    if ($0 ~ /\w+\s\w+\s[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}\s[0-9]{4}/){
        match($0, /\w+\s(\w+)\s([0-9]{2})\s([0-9]{2}):([0-9]{2}):([0-9]{2})\s([0-9]{4})/, n);
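        # convert the month abbreviation to a two-digit number (Sep -> 09)
        match("JanFebMarAprMayJunJulAugSepOctNovDec",n[1]);
        n[1] = sprintf("%02d",(RSTART+2)/3);
        # mktime expects "YYYY MM DD HH MM SS" and returns seconds since the epoch
        arr[count]=mktime(n[6] " " n[1] " " n[2] " " n[3] " " n[4] " " n[5]);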
        order[i]=count;
        i++;
    }
    else if ($0 ~ /User-Name\s+=\s+\"\w+\"/){
        match($0, /User-Name\s+=\s+\"(\w+)\"/, n);
        name[count]=n[1];
    }
    else if ($0 ~ /Acct-Status-Type\s+=\s+\w+/){
        match($0, /Acct-Status-Type\s+=\s+(\w+)/, n);
        status[count]=n[1];
    }
    else if ($0 ~ /^$/){
        count++;
    }
    else if ($0 ~ /Acct-Input-Packets\s+=\s[0-9]+/){
        match($0, /Acct-Input-Packets\s+=\s([0-9]+)/, n);
        input[count]=n[1];
    }
    else if ($0 ~ /Acct-Output-Packets\s+=\s[0-9]+/){
        match($0, /Acct-Output-Packets\s+=\s([0-9]+)/, n);
        output[count]=n[1];
    }
}
END{
    for (i = 1; i <= length(order); i++) {

        val = name[order[i]];

        # first entry seen for this user: create the packed record
        if (length(user[val]) == 0) {

            valueStart = "0";

            if (status[order[i]] == "Start"){
                valueStart = arr[order[i]];
            }
            user[val]= valueStart "|0|0|0|0";
        }
        else {
            split(user[val], nameArr, "|");

            if (status[order[i]]=="Stop"){
                nameArr[2]++;
                nameArr[3]+=arr[order[i]]-nameArr[1]
            }
            else if (status[order[i]] == "Start"){
                # store date start
                nameArr[1] = arr[order[i]];
            }

            nameArr[4]+=input[order[i]];

            nameArr[5]+=output[order[i]];

            user[val]= nameArr[1] "|" nameArr[2] "|" nameArr[3] "|" nameArr[4] "|" nameArr[5];
        }
    }

    for (usr in user) {
        split(user[usr], usrArr, "|");
        print "User: " usr;
        print "Number of Sessions: " usrArr[2];
        print "Total Connect Time: " usrArr[3];
        print "Input Bandwidth Usage: " usrArr[4];
        print "Output Bandwidth Usage: " usrArr[5];
        print "------------------------";

    }
}' test.txt

The values are extracted with the match function, for example:

match($0, /User-Name\s+=\s+\"(\w+)\"/, n);
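The three-argument form of match() is a gawk extension. For an awk without it, roughly the same extraction can be done with the standard RSTART and RLENGTH variables. The sketch below is such a substitute rather than the code used above, and extract_user is a made-up name.

```shell
# extract_user is a hypothetical helper name used only for this sketch.
extract_user() {
  awk '
    match($0, /User-Name *= *"[^"]*"/) {
      field = substr($0, RSTART, RLENGTH)   # the matched text, e.g. User-Name = "wynng"
      q = index(field, "\"")                # position of the opening quote
      print substr(field, q + 1, length(field) - q - 1)   # text between the quotes
    }
  ' "$@"
}

printf '%s\n' 'User-Name = "wynng"' 'NAS-Port = 20104' | extract_user
```

Only the User-Name line produces output; the second line does not match and is skipped.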

For the date, we have to parse the month name. I've used the solution in this post to convert it to a month number:

match($0, /\w+\s(\w+)\s([0-9]{2})\s([0-9]{2}):([0-9]{2}):([0-9]{2})\s([0-9]{4})/, n);
match("JanFebMarAprMayJunJulAugSepOctNovDec",n[1])
n[1] = sprintf("%02d",(RSTART+2)/3);
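To see why the arithmetic works, here is the month lookup on its own. The wrapper function month_num is only a name chosen for this sketch; the lookup string and the formula are the ones from the script.

```shell
# month_num is a hypothetical helper name used only for this sketch.
month_num() {
  awk -v mon="$1" 'BEGIN {
    # match() sets RSTART to where the abbreviation begins: Jan=1, Feb=4, ... Sep=25
    match("JanFebMarAprMayJunJulAugSepOctNovDec", mon)
    # each name is 3 characters wide, so (RSTART + 2) / 3 is the month number
    printf "%02d\n", (RSTART + 2) / 3
  }'
}

month_num Sep   # prints 09
month_num Dec   # prints 12
```

For Sep, RSTART is 25, and (25 + 2) / 3 = 9, which sprintf pads to 09 for mktime.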

All the processing of the collected values is done in the END clause, where the values have to be grouped by user. I create a user array with the username as key and, as value, a concatenation of the different fields delimited by |:

[startDate] "|" [sessionNum] "|" [connectionTime] "|" [inputUsage] "|" [outputUsage]
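As a small sketch of how one such packed record is updated when a Stop entry is processed, assuming the field order shown above (apply_stop and its arguments are names invented for this example, and the timestamps are plain numbers instead of mktime results):

```shell
# apply_stop is a hypothetical helper: record, stop time, input packets, output packets.
apply_stop() {
  awk -v rec="$1" -v stop="$2" -v inpkts="$3" -v outpkts="$4" 'BEGIN {
    split(rec, a, "|")      # a[1]=start a[2]=sessions a[3]=time a[4]=input a[5]=output
    a[2]++                  # one more completed session
    a[3] += stop - a[1]     # seconds between Start and Stop
    a[4] += inpkts
    a[5] += outpkts
    print a[1] "|" a[2] "|" a[3] "|" a[4] "|" a[5]
  }'
}

apply_stop "100|0|0|0|0" 214 83 66   # prints 100|1|114|83|66
```

Packing the fields into one string is a way to keep several values per user in a one-dimensional awk array; the cost is a split and a re-join on every update.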

With this data input (your data extended), it gives:

User: TOTO
Number of Sessions: 1
Total Connect Time: 114
Input Bandwidth Usage: 83
Output Bandwidth Usage: 66
------------------------
User: wynng
Number of Sessions: 2
Total Connect Time: 228
Input Bandwidth Usage: 166
Output Bandwidth Usage: 132
------------------------
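Note that the connect time above is computed from the timestamps (12:17:46 to 12:19:40 is 114 seconds), while the Stop entry itself reports Acct-Session-Time = 115. If you would rather sum the values the log reports, one possible variation is to read the file in paragraph mode and keep one array per statistic, keyed by user name. This is only a sketch of a different technique, not the script above. It assumes every Stop entry carries Acct-Session-Time, Acct-Input-Packets and Acct-Output-Packets as in the sample; summarize_sessions and the one-line output format are made up for the example.

```shell
# summarize_sessions is a hypothetical helper name used only for this sketch.
summarize_sessions() {
  awk '
    BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one blank-line-separated entry per record
    {
      split("", f)                 # portable way to clear the per-entry field map
      for (i = 1; i <= NF; i++) {
        p = index($i, " = ")
        if (p == 0) continue       # the date line has no " = "
        key = substr($i, 1, p - 1)
        gsub(/^[ \t]+/, "", key)   # tolerate indented attribute lines
        f[key] = substr($i, p + 3)
      }
      if (f["Acct-Status-Type"] != "Stop") next   # only Stop entries carry the totals
      u = f["User-Name"]
      gsub(/"/, "", u)
      if (!(u in sessions)) names[++n] = u        # remember first-seen order
      sessions[u]++
      time[u] += f["Acct-Session-Time"]
      inp[u]  += f["Acct-Input-Packets"]
      out[u]  += f["Acct-Output-Packets"]
    }
    END {
      for (k = 1; k <= n; k++) {
        u = names[k]
        printf "%s %d %d %d %d\n", u, sessions[u], time[u], inp[u], out[u]
      }
    }
  ' "$@"
}

printf '%s\n' 'Mon Sep 15 12:17:46 1997' 'User-Name = "wynng"' 'Acct-Status-Type = Start' '' \
  'Mon Sep 15 12:19:40 1997' 'User-Name = "wynng"' 'Acct-Status-Type = Stop' \
  'Acct-Session-Time = 115' 'Acct-Input-Packets = 83' 'Acct-Output-Packets = 66' | summarize_sessions
```

The columns are user, sessions, connect time, input packets and output packets; the printf in END can be changed to print the labelled layout from the question. Counting only Stop entries also avoids the divide-by-two step, at the price of ignoring sessions that have a Start but no Stop.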
Bertrand Martel
  • Whoa bro!!!!! Thanks for the in depth explanation and solution. Took me a bit to figure out how you were able to match the data with a specific user (I read and tried to decipher the script before I read your explanations; a skill I was told to work on). This tells me I still lack some insight and thus need more practice and maybe more time in the man pages. Thanks again!!!! – user3299549 Mar 06 '17 at 14:13