
It seems that for any given "event" that auditd picks up, there are on the order of four log lines added to the auditd log.

Is there any predictable pattern that can be used to group log file lines into a single event? Specifically, I'm looking for something that denotes the start and end of an event.

For example, it seems like "type=SYSCALL" denotes the start of an event. But auditd docs I've found show that there are a ton -- a ton, I tell ya -- of different record types and the implication to me is that "SYSCALL" might not always be the indicator of an event.

Even more specifically, I'm asking this because I am using Sumologic to analyze my logs and they have a regex-based way of grouping multi-line log data into a single event. I will be asking them this question as well, but since this is more a question about auditd than it is about Sumologic, I thought it would be useful to ask to this community.

JDS

2 Answers


I would recommend you use the ausearch utility to pre-process the logs before sending them to an analysis capability. First, it inserts an event separator (four -'s); second, it can optionally interpret the data to make it more readable (convert hex strings back into text, resolve UIDs into usernames); and third, you can use its checkpointing capability to pull only new events on successive invocations (see the man page). You can possibly use it to select events of interest as well.
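For example (the log and checkpoint file paths below are illustrative, not required defaults):

```shell
# Interpret raw values (hex strings back to text, UIDs to usernames)
# and insert a "----" separator line between events:
ausearch -i --input /var/log/audit/audit.log

# Keep a checkpoint so each successive run only returns events
# newer than the last one processed:
ausearch -i --checkpoint /var/run/audit.ckpt
```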

I would point out that Linux auditd events are quite complex, and care should be taken in deciding what you need for analysis. Ideally, you should store the original logs (after ausearch -i) in a data store that can also normalize them into events for passing to an analysis capability, or to multiple analysis capabilities (in case one product doesn't answer all your questions).

BurnA
  • is there a way to pre-process in real time that will ensure that (a) I don't have duplicate log entries and (b) I don't *miss* log entries? – JDS Mar 04 '16 at 17:51
  • You could use audispd (to get a copy of the logs) and write your own processor (see auparse-feed). But I wouldn't unless you have a defined real-time need for the data, as you would basically be re-writing elements of auditd and ausearch to save a few MBs, and possibly introducing loss of log entries (new code == possible new bugs). – BurnA Mar 04 '16 at 21:19
  • I want to point out that this isn't just about real-time, but also enabling streaming log processing. The days of batch processing are over. – blueben Nov 30 '17 at 22:06

Is there any predictable pattern that can be used to group log file lines into a single event? Specifically, I'm looking for something that denotes the start and end of an event.

Yes, event ID. The spec states:

An audit event is all records that have the same host (node), timestamp, and serial number. Each event on a host (node) has a unique timestamp and serial number. An event is composed of multiple records which have information about different aspects of an audit event. Each record is denoted by a type which indicates what fields will follow.

"records" == log lines. In the following example the msg field gives a millisecond timestamp followed by an event id (msg=audit(1605786316.840:248065)):

node=master type=SYSCALL msg=audit(1605786316.840:248065): arch=c000003e syscall=90 success=yes exit=0 ...
node=master type=CWD msg=audit(1605786316.840:248065): cwd="/home/sam"
node=master type=PATH msg=audit(1605786316.840:248065): item=0 inode=16075850 dev=fe:01 mode=0100644 o...
node=master type=PROCTITLE msg=audit(1605786316.840:248065): proctitle="/tmp/test"
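Given that key, grouping is a straightforward scan. Here is a minimal sketch in awk that extracts the audit(timestamp:serial) key from each line and prints a "----" separator whenever it changes, mirroring ausearch's output; the sample file path is illustrative, and it assumes records for one event appear contiguously (see the caveat below):

```shell
# Sample records with two distinct event IDs, written to a scratch file.
cat > /tmp/audit_sample.log <<'EOF'
node=master type=SYSCALL msg=audit(1605786316.840:248065): arch=c000003e syscall=90
node=master type=CWD msg=audit(1605786316.840:248065): cwd="/home/sam"
node=master type=SYSCALL msg=audit(1605786316.900:248066): arch=c000003e syscall=59
node=master type=PROCTITLE msg=audit(1605786316.900:248066): proctitle="/tmp/test"
EOF

# Extract the msg=audit(<timestamp>:<serial>) key from each record and
# emit a separator line whenever the key changes.
awk 'match($0, /msg=audit\([0-9.]+:[0-9]+\)/) {
       id = substr($0, RSTART, RLENGTH)
       if (prev != "" && id != prev) print "----"
       prev = id
     }
     { print }' /tmp/audit_sample.log
```

The same regex (msg=audit\([0-9.]+:[0-9]+\)) is the kind of pattern a multi-line grouping rule in a log-analysis tool would key on.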

Now, an open question I couldn't find an answer to is whether records from the same event are guaranteed to be contiguous in the log. It would be extremely surprising if they weren't, and I've never actually observed it, but AFAIK it has not been explicitly stated anywhere in the docs. In fact, the textual log format seems somewhat underspecified from what I can gather, so you are safer using ausearch/auparse or, if you're game, the binary libaudit C interface (gross and tedious...).

spinkus