My goal was to create an _id in Elasticsearch that includes the logging time, so that it is never duplicated even if the same log is sent through Logstash again.
After throwing a few more hours at the problem, I have some conclusions that, as far as I am concerned, are not documented well enough, plus a recommended workaround.
1) If the timestamp in the log file includes a time zone, nothing in Logstash can override it. So don't waste time on timezones, partial matching, or adding a timezone. If the time has a Z at the end, it will be treated as GMT. I think it is a bug that no warning is issued when this happens.
2) Logstash writes events to standard output or to a file with the time in its local time zone, regardless of the format of the input string.
3) Logstash works with the time in its local time zone, so concatenating the time into a variable gets messed up, even if the original string was GMT. So don't even try to work with the @timestamp field!
4) Elasticsearch works in GMT, so it behaves properly. What you see in the Logstash output as "@timestamp" => "2015-02-21T20:26:24.921-08:00" gets properly interpreted by Elasticsearch as "@timestamp" => "2015-02-22T04:26:24.921Z"
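To make points 1 and 4 concrete, here is a rough Python sketch (not Logstash code) of the behavior I believe is at work: a configured zone only matters when the string carries no zone of its own, and an offset rendering and a Z rendering can name the same instant. The function `to_utc`, the `default_tz` parameter, and the sample timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def to_utc(value: str, default_tz: Optional[timezone] = None) -> datetime:
    """Parse an ISO-style timestamp and return it as an aware UTC datetime.

    default_tz is a hypothetical stand-in for a configured time zone; it is
    consulted only when the string itself carries no zone information.
    """
    try:
        # %z accepts a trailing "Z" as well as offsets like "-08:00" (Python 3.7+).
        parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")
    except ValueError:
        # No zone in the string: only now does the configured zone apply.
        naive = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f")
        parsed = naive.replace(tzinfo=default_tz or timezone.utc)
    return parsed.astimezone(timezone.utc)


# "Etc/GMT-8" means UTC+8, because POSIX-style zone names invert the sign.
plus_eight = timezone(timedelta(hours=8))

with_z = to_utc("2015-06-01T17:15:30.250Z", default_tz=plus_eight)
with_offset = to_utc("2015-06-01T09:15:30.250-08:00")
without_zone = to_utc("2015-06-01T17:15:30.250", default_tz=plus_eight)

print(with_z.isoformat())        # the Z wins; the configured zone is ignored
print(with_z == with_offset)     # same instant, two different renderings
print(without_zone.isoformat())  # here the configured zone shifts the result
```

The only case where the configured zone changes the result is the last one, where the string has no Z or offset.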
So my workaround is as follows:
1) Keep the log's time in a field that is NOT @timestamp
2) Consistently save the time in the log files as GMT and mark it with a trailing Z
3) Use the date filter in its most basic form, with no timezone attribute
filter {
  date {
    match => ["log_time", "YYYY-MM-dd'T'HH:mm:ss.SSSZ"]
    # timezone => "Etc/GMT-8" <--- THIS DOES NOT WORK IF THERE IS A Z IN SOURCE
  }
}
4) Create time derivatives straight from the log field, not from @timestamp, e.g.
output {
  stdout { codec => rubydebug }
  elasticsearch {
    host => localhost
    document_id => "%{log_time}-%{host}" # <--- DO THIS
    # document_id => "%{@timestamp}-%{host}" <--- DON'T DO THIS
  }
}
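The reason the document_id trick prevents duplicates can be sketched in a few lines of Python. This is only a toy model under my own assumptions: a plain dict stands in for the Elasticsearch index, and `document_id`, `index_event`, and the sample event (including the host name `web01`) are hypothetical names, not part of any Logstash or Elasticsearch API.

```python
from typing import Dict

Event = Dict[str, str]


def document_id(event: Event) -> str:
    """Build the ID from the raw log fields, mirroring "%{log_time}-%{host}"."""
    return f"{event['log_time']}-{event['host']}"


def index_event(index: Dict[str, Event], event: Event) -> None:
    """Store the event under its ID; an existing entry with that ID is replaced."""
    index[document_id(event)] = event


index: Dict[str, Event] = {}
event = {"log_time": "2015-06-01T17:15:30.250Z", "host": "web01", "message": "started"}

index_event(index, event)
index_event(index, dict(event))  # the same log line shipped a second time

print(len(index))   # still a single document
print(list(index))  # the ID is built purely from the log's own fields
```

Because the ID depends only on what is written in the log line, replaying the file produces the same ID and overwrites the existing document instead of adding a new one.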
If Jordan Sissel happens to read this: I believe Logstash should be consistent with Elasticsearch by default, or at least have an option to output and work internally in GMT. I had a rocky start, going through what everyone goes through when trying out the tool for the first time with existing logs.