
I want to use Logstash to collect a log file, and the format of the file is like this:

type=USER_START msg=audit(1404170401.294:157): user pid=29228 uid=0 auid=0 ses=7972 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:session_open acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'

Which filter should I use to match this line? Or is there another way to handle it?

Any help would be appreciated.


I used the pattern below to match the line in the Grok Debugger, but still got a "No matches" message.

type=%{WORD:audit_type} msg=audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\): user pid=%{NUMBER:audit_pid} uid=%{NUMBER:audit_uid} auid=%{NUMBER:audit_audid} subj=%{WORD:audit_subject} msg=%{GREEDYDATA:audit_message}

But when I removed subj=%{WORD:audit_subject} msg=%{GREEDYDATA:audit_message}, it succeeded and returned a JSON object like this:

{
  "audit_type": [
    [
      "USER_END"
    ]
  ],
  "audit_epoch": [
    [
      "1404175981.491"
    ]
  ],
  "BASE10NUM": [
    [
      "1404175981.491",
      "524",
      "1465",
      "0",
      "0"
    ]
  ],
  "audit_counter": [
    [
      "524"
    ]
  ],
  "audit_pid": [
    [
      "1465"
    ]
  ],
  "audit_uid": [
    [
      "0"
    ]
  ],
  "audit_audid": [
    [
      "0"
    ]
  ]
}

I don't know why the subj and msg parts don't match.

txworking

4 Answers


The audit logs are written as a series of key=value pairs, which are easily extracted using the kv filter. However, I have noticed that the key msg is sometimes used twice, and that its value is itself a series of key=value pairs.

First, grok is used to get the fields audit_type, audit_epoch, audit_counter, and sub_msg (the second msg field):

grok {
  pattern => [ "type=%{DATA:audit_type}\smsg=audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\):.*?( msg=\'(?<sub_msg>.*?)\')?$" ]
  named_captures_only => true
}

kv is used to extract all of the key=value pairs except for msg and type since we have already obtained that data with grok:

kv {
  exclude_keys => [ "msg", "type" ]
}

kv is used again to parse the key=value pairs in sub_msg (if it exists):

kv {
  source => "sub_msg"
}

date is used to set the event timestamp to the value in audit_epoch; the UNIX date format parses float or integer epoch timestamps:

date {
  match => [ "audit_epoch", "UNIX" ]
}

Lastly mutate is used to remove redundant fields:

mutate {
  remove_field => ['sub_msg', 'audit_epoch']
}

You could also rename fields like sysadmin1138 suggested:

mutate {
  rename => [
    "auid", "uid_audit",
    "fsuid", "uid_fs",
    "suid", "uid_set",
    "ses", "session_id"
  ]
  remove_field => ['sub_msg', 'audit_epoch']
}

All combined the filter looks like this:

filter {
  grok {
    pattern => [ "type=%{DATA:audit_type}\smsg=audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\):.*?( msg=\'(?<sub_msg>.*?)\')?$" ]
    named_captures_only => true
  }
  kv {
    exclude_keys => [ "msg", "type" ]
  }
  kv {
    source => "sub_msg"
  }
  date {
    match => [ "audit_epoch", "UNIX" ]
  }
  mutate {
    rename => [
      "auid", "uid_audit",
      "fsuid", "uid_fs",
      "suid", "uid_set",
      "ses", "session_id"
    ]
    remove_field => ['sub_msg', 'audit_epoch']
  }
}
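To experiment with a filter like this locally, a minimal stdin-to-stdout pipeline can be wrapped around it; the file name and invocation below are illustrative, not fixed:

```
# audit-test.conf -- minimal harness for trying the filter interactively.
# Run e.g.: bin/logstash -f audit-test.conf < audit.log
input {
  stdin { }
}

# Paste the complete filter { ... } block from above here.

output {
  # rubydebug prints each parsed event as a readable hash
  stdout { codec => rubydebug }
}
```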
E. Blevins

A quick search finds these patterns on GitHub:

AUDIT type=%{WORD:audit_type} msg=audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\): user pid=%{NUMBER:audit_pid} uid=%{NUMBER:audit_uid} auid=%{NUMBER:audit_audid} subj=%{WORD:audit_subject} msg=%{GREEDYDATA:audit_message} 
AUDITLOGIN type=%{WORD:audit_type} msg=audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\): login pid=%{NUMBER:audit_pid} uid=%{NUMBER:audit_uid} old auid=%{NUMBER:old_auid} new auid=%{NUMBER:new_auid} old ses=%{NUMBER:old_ses} new ses=%{NUMBER:new_ses}

A cursory review suggests it's probably what you're looking for.
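If you go this route, save those two lines in a file inside a custom patterns directory and point grok at it; the directory path below is an assumption, not a fixed location:

```
filter {
  grok {
    # Directory containing the AUDIT / AUDITLOGIN pattern definitions
    patterns_dir => [ "/etc/logstash/patterns" ]
    # Try the login variant first, then fall back to the generic pattern
    match => { "message" => [ "%{AUDITLOGIN}", "%{AUDIT}" ] }
  }
}
```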

user9517

A better solution than grok may be to use the kv filter. This parses fields configured in "key=value" format, which most audit-log entries are. Unlike grok, this will handle strings with sometimes-there-sometimes-not fields. However, the field names are in their less-useful short forms, so you may need to do some field renaming.

filter { 
  kv { }
}
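As a rough sketch of what that yields on the sample line from the question (not verified output; the lone token user has no = and is ignored, and the duplicated msg key typically becomes an array of both values):

```
# Approximate fields produced by a bare kv pass:
#   type => "USER_START"
#   msg  => the audit(...) header plus the quoted inner message
#   pid  => "29228"
#   uid  => "0"
#   auid => "0"
#   ses  => "7972"
#   subj => "system_u:system_r:crond_t:s0-s0:c0.c1023"
```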

That would get you most of it, and the fields would match what shows up in the logs. All the data types would be strings. If you want to go to the trouble of humanizing the field names:

filter {
  kv { }
  mutate {
    rename => { 
      "type" => "audit_type"
      "auid" => "uid_audit"
      "fsuid => "uid_fs"
      "suid" => "uid_set"
      "ses" => "session_id"
    }
  }
}

The msg field, which contains the timestamp and event ID, will still need to be grokked, though. The other answers show how to do that.

filter {
  kv { }
  grok {
    match => { "msg" => "audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\):" }
  }
  mutate {
    rename => { 
      "type" => "audit_type"
      "auid" => "uid_audit"
      "fsuid => "uid_fs"
      "suid" => "uid_set"
      "ses" => "session_id"
    }
  }
}
sysadmin1138

The format for grok has changed, so have a look at this:

filter {
    grok {
        # example: type=CRED_DISP msg=audit(1431084081.914:298): pid=1807 uid=0 auid=1000 ses=7 msg='op=PAM:setcred acct="user1" exe="/usr/sbin/sshd" hostname=host1 addr=192.168.160.1 terminal=ssh res=success'
        match => { "message" => "type=%{WORD:audit_type} msg=audit\(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}\): pid=%{NUMBER:audit_pid} uid=%{NUMBER:audit_uid} auid=%{NUMBER:audit_audid} ses=%{NUMBER:ses} msg=\'op=%{WORD:operation}:%{WORD:detail_operation} acct=\"%{WORD:acct_user}\" exe=\"%{GREEDYDATA:exec}\" hostname=%{GREEDYDATA:hostname} addr=%{GREEDYDATA:ipaddr} terminal=%{WORD:terminal} res=%{WORD:result}\'" }
    }
    date {
        match => [ "audit_epoch", "UNIX_MS" ]
    }
}

This uses the date from audit_epoch as @timestamp.

Benjamin