0

I've been trying to set up a custom multiline log parser to get logs into Stackdriver with some readable fields. Currently it looks like this:

<source>
    type tail
    read_from_head true
    path /root/ansible.log
    pos_file /var/lib/google-fluentd/pos/ansible.pos
    time_format "%a %b %e %T %Z %Y"
    format multiline
    format_firstline /Started ansible run at/
    format1 /Started ansible run at (?<timestart>[^\n]+)\n(?<body>.*)/
    format2 /PLAY RECAP.*/
    format3 /ok=(?<ok>\d+)\s+changed=(?<changed>\d+)\s+unreachable=(?<unreachable>\d+)\s+failed=(?<failed>\d+).*/
    format4 /Finished ansible run at (?<timeend>[^\n]+)/
    tag ansible
</source>

It's done to the specifications at http://docs.fluentd.org/v0.12/articles/parser_multiline, and it works. But it works without a proper time stamp - timestart and timeend are just simple fields in the json. So in this current state, the time_format setting is useless, because I don't have a time variable among the regexes. This does aggregate all the variables I need, logs show up in Stackdriver when I run the fluend service, and all is almost happy.

However, when I change one of those time variables' name to time, trying to actually assign a Stackdriver timestamp to the entry, it doesn't work. The fluentd log on the machine says that the worker started and parsed everything, but logs don't show up in the Stackdriver console at all.

timestart and timeend look like Fri Jun 2 20:39:58 UTC 2017 or something along those lines. The time format specifications are at http://ruby-doc.org/stdlib-2.4.1/libdoc/time/rdoc/Time.html#method-c-strptime and I've checked and double checked them too many times and I can't figure out what I'm doing wrong.

EDIT: another detail: when I try to parse out the time variable, while the logs don't show up in the Stackdriver console, the appropriate tag (in this case ansible) shows up in the list of tags. It's just that the results are empty.

snetch
  • 458
  • 4
  • 18

1 Answers1

0

You're correct that the Stackdriver logging agent looks for the timestamp in the 'time' field, but it uses Ruby's Time.iso8601 to parse that value (falling back on Time.at on error). The string you quoted (Fri Jun 2 20:39:58 UTC 2017) is not in either of those formats, so it fails to parse it (you could probably see the error in /var/log/google-fluentd/google-fluentd.log). You could add a record_transformer plugin to your config to change your parsed date to the right format (hint: enable_ruby is your friend). Something like:

<filter foo.bar>
  @type record_transformer
  enable_ruby
  <record>
    time ${Time.strptime(record['time'], '%a %b %d %T %Z %Y').iso8601}
  </record>
</filter>

should work...

Igor Peshansky
  • 727
  • 4
  • 12
  • Except, as I said, I don't see any errors in the fluentd log, the fluentd log says everything is ok, it's just that the logs don't show up in Stackdriver. And as for the time format, I'm already trying to use `%a %b %e %T %Z %Y` in the above block, so I'm just wondering why my method isn't working. – snetch Jul 14 '17 at 15:49