I've been trying to set up a custom multiline log parser to get logs into Stackdriver with some readable fields. Currently it looks like this:
<source>
type tail
read_from_head true
path /root/ansible.log
pos_file /var/lib/google-fluentd/pos/ansible.pos
time_format "%a %b %e %T %Z %Y"
format multiline
format_firstline /Started ansible run at/
format1 /Started ansible run at (?<timestart>[^\n]+)\n(?<body>.*)/
format2 /PLAY RECAP.*/
format3 /ok=(?<ok>\d+)\s+changed=(?<changed>\d+)\s+unreachable=(?<unreachable>\d+)\s+failed=(?<failed>\d+).*/
format4 /Finished ansible run at (?<timeend>[^\n]+)/
tag ansible
</source>
It's done to the specifications at http://docs.fluentd.org/v0.12/articles/parser_multiline, and it works. But it works without a proper time stamp - timestart
and timeend
are just simple fields in the json. So in this current state, the time_format
setting is useless, because I don't have a time
variable among the regexes. This does aggregate all the variables I need, logs show up in Stackdriver when I run the fluend service, and all is almost happy.
However, when I change one of those time variables' name to time
, trying to actually assign a Stackdriver timestamp to the entry, it doesn't work. The fluentd log on the machine says that the worker started and parsed everything, but logs don't show up in the Stackdriver console at all.
timestart
and timeend
look like Fri Jun 2 20:39:58 UTC 2017
or something along those lines. The time format specifications are at http://ruby-doc.org/stdlib-2.4.1/libdoc/time/rdoc/Time.html#method-c-strptime and I've checked and double checked them too many times and I can't figure out what I'm doing wrong.
EDIT: another detail: when I try to parse out the time
variable, while the logs don't show up in the Stackdriver console, the appropriate tag (in this case ansible
) shows up in the list of tags. It's just that the results are empty.