My question is as follows. I upload logs to Elasticsearch using Filebeat and Logstash. The logs look like this:
"03.08.2020 10:56:38","Event LClick","Type Menu","t=0","beg"
"03.08.2020 10:56:38","Event LClick","Type Menu","Detail Impale","t=109","end"
"03.08.2020 10:56:40","Event LClick","t=1981","beg"
"03.08.2020 10:56:40","Event LClick","t=2090","end"
"03.08.2020 10:56:41","Event LClick","Type ToolBar","t=3026","beg"
"03.08.2020 10:56:44","Event FormActivate","Name SomeName","t=5444"
"03.08.2020 10:56:43","Event LClick","Type ToolBar","Detail Test","t=4477","end"
These are logs of actions performed by users in web forms. Each action has a beginning ("beg" at the end of a line) and an end ("end" at the end of a line).
I need to calculate how long each action took (the time between its "beg" line and its "end" line) and output it as a new field, even if the difference is zero.
Example: "03.08.2020 10:56:44" - "03.08.2020 10:56:41" = 3 seconds (This should be a new field)
Maybe I need to combine the fields somehow?
If there is a way to subtract dates inside Logstash, how can I apply it to actions that have other events between their beginning and end, for example "Event FormActivate"?
Or maybe this can be solved with queries in Elasticsearch itself.
I am a complete newbie and would appreciate any help. My current Logstash config:
input {
  beats {
    port => '5044'
  }
}
filter {
  mutate {
    remove_field => [ '@version', 'input', 'host', 'ecs', 'agent' ]
    remove_tag => [ 'beats_input_codec_plain_applied' ]
  }
  grok {
    patterns_dir => ['./patterns']
    match => { 'message' => '%{TIME:timestamp}(","Event\s)(?<event>([^"]+))(","Form\s)?(?<form>([^"]+))?(","ParentType\s)?(?<parent_type>([^"]+))?(","ParentName\s)?(?<parent_name>([^"]+))?(","Type\s)?(?<type>([^"]+))?(","Name\s)?(?<name>([^"]+))?(","Detail\s)?(?<detail>([^"]+))?(","t=)?(?<t>([\d]+))?' }
  }
  date {
    match => [ 'timestamp', 'dd.MM.yyyy HH:mm:ss' ]
    timezone => 'Europe/Moscow'
    target => '@timestamp'
    remove_field => 'timestamp'
  }
  mutate {
    rename => ['log', 'user_path']
    rename => ['@timestamp', 'logdate']
  }
}
output {
  elasticsearch {
    hosts => ['localhost:9200']
    index => 'test'
  }
}
Update:
I tried to follow the approach from the thread suggested by Val, but I still didn't succeed. This is what I changed in the Logstash config:
filter {
  grok {
    patterns_dir => ['./patterns']
    match => { 'message' => '%{TIME:timestamp}(","Event\s)(?<event>([^"]+))(","Form\s)?(?<form>([^"]+))?(","ParentType\s)?(?<parent_type>([^"]+))?(","ParentName\s)?(?<parent_name>([^"]+))?(","Type\s)?(?<type>([^"]+))?(","Name\s)?(?<name>([^"]+))?(","Detail\s)?(?<detail>([^"]+))?(","t=)?(?<t>([\d]+))?(",")?(?<status>(end|beg))?' }
    add_tag => [ '%{status}' ]
  }
  date {
    match => [ 'timestamp', 'dd.MM.yyyy HH:mm:ss' ]
  }
  elapsed {
    unique_id_field => 'event'
    start_tag => 'beg'
    end_tag => 'end'
    new_event_on_match => true
    add_tag => ['1->2']
  }
  if '1->2' in [tags] and 'elapsed' in [tags] {
    aggregate {
      task_id => '%{event}'
      # Logstash 5+ reads fields with event.get; event["..."] no longer works there
      code => 'map["report"] = [(event.get("elapsed_time") * 1000).to_i]'
      map_action => 'create'
      end_of_task => true
    }
  }
}
But it just doesn't work. I think I have thoroughly confused myself :(
Maybe it will help if I show what I want to see in Elasticsearch. For the seven log lines at the beginning of the post, each record should look like this:
{
  "username" => "I will get the username from the log path and I want it to get here too",
  "elapsed_time" => date difference,
  "event" => "event from line",
  "elapsed_timestamp_start" => "start time"
}
The seven log lines should produce three such records in Elasticsearch. Please help me write a filter for this task. Thank you!
Another question, about this note in the documentation for the Aggregate filter plugin:
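To make the expected result concrete, here is a minimal sketch in Python (not Logstash) of the pairing I have in mind. It assumes that actions never overlap, so each "end" line closes the most recent "beg" line, and it takes the elapsed time from the timestamps of those two lines. The function name `pair_actions` and the `username` placeholder are made up for illustration.

```python
import csv
from datetime import datetime

# The seven sample lines from the beginning of the post.
LOG_LINES = [
    '"03.08.2020 10:56:38","Event LClick","Type Menu","t=0","beg"',
    '"03.08.2020 10:56:38","Event LClick","Type Menu","Detail Impale","t=109","end"',
    '"03.08.2020 10:56:40","Event LClick","t=1981","beg"',
    '"03.08.2020 10:56:40","Event LClick","t=2090","end"',
    '"03.08.2020 10:56:41","Event LClick","Type ToolBar","t=3026","beg"',
    '"03.08.2020 10:56:44","Event FormActivate","Name SomeName","t=5444"',
    '"03.08.2020 10:56:43","Event LClick","Type ToolBar","Detail Test","t=4477","end"',
]


def pair_actions(lines, username="unknown"):
    """Pair each "beg" line with the next "end" line; return one record per pair.

    Assumption (mine): actions do not overlap, so one open action at a time is enough.
    """
    records = []
    start = None  # (timestamp, event) of the currently open action, if any
    for fields in csv.reader(lines):
        ts = datetime.strptime(fields[0], "%d.%m.%Y %H:%M:%S")
        event = fields[1].split(" ", 1)[1]  # "Event LClick" -> "LClick"
        status = fields[-1]                 # "beg", "end", or something else
        if status == "beg":
            start = (ts, event)
        elif status == "end" and start is not None:
            begin_ts, begin_event = start
            records.append({
                "username": username,
                "event": begin_event,
                "elapsed_time": (ts - begin_ts).total_seconds(),
                "elapsed_timestamp_start": begin_ts.isoformat(),
            })
            start = None
        # Any other line (e.g. "Event FormActivate") sits between beg and end
        # and is skipped.
    return records


if __name__ == "__main__":
    for record in pair_actions(LOG_LINES):
        print(record)
```

For the seven sample lines this yields three records, one per beg/end pair, and the "Event FormActivate" line in between does not break the pairing. This is the behaviour I am trying to reproduce with a Logstash filter.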
You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work correctly otherwise events may be processed out of sequence and unexpected results will occur.
I could not find out where this flag needs to be added. Maybe that is the problem.