How to produce rows for non-existing time buckets?

Question

I have produced a table like this:

+----------+---------+---------+-------+
| _time    | field_1 | field_2 | count |
+----------+---------+---------+-------+
| 08:00:00 | A       | 1       | 2     |
+----------+---------+---------+-------+
| 08:00:00 | B       | 1       | 4     |
+----------+---------+---------+-------+
| 08:00:03 | B       | 3       | 1     |
+----------+---------+---------+-------+
| 08:00:03 | A       | 2       | 3     |
+----------+---------+---------+-------+

I want to know: what's average & maximum count, per (field_1+field_2) combination, per second. The problem is the _time is missing some seconds, so the stats count result only give me the aggregated results on existing time buckets.

How can I expand this table to include every missing time seconds, just by filling count=0, for each (field_1+field_2) combination? As long as I can do this, I can get the result simply by stats avg(count) max(count) by field_1 field_2.

score 1 · Answer 1 · answered Mar 05 '18 at 16:17

I also posted my question here: https://answers.splunk.com/answers/624043/how-to-produce-rows-for-non-existing-time-buckets.html, and I got a nice answer to fill the missing timeslots.

<your base search> 
| eval field= field_1 + "|" + field_2 
| timechart limit=0 span=1s count by field
| fillnull value=0
| untable _time field count 
| eval temp = split(field,"|") 
| eval field_1 = mvindex(temp,0) 
| eval field_2 = mvindex(temp,1) 
| stats sum(count) as count by field_1 field_2 _time | sort- _time

score 0 · Answer 2 · answered Mar 05 '18 at 15:41

It will be difficult to debug without data. But instead of adding an extra entry for each missing second for each field1/field2 combination, you can do something like:

your_query
| eventstats earliest(_time) as etime, latest(_time) as ltime by field1, field2
| stats sum(count) as c, max(count) as maxCount, values(etime) as etime, values(ltime) as ltime by field1, field2
| eval avgCount = c/(ltime-etime)
| table field1, field2, avgCount, maxCount

This will find the latest time and the earliest time for each field1+field2 combination. Then, the average is calculated manually by dividing the sum of all counts for a field1+field2 combination by the number of seconds (latest-earliest).

Thanks, this will do the work too. I also found a way to fill the missing time slots, I will post it as well. — Yong Li, Mar 05 '18 at 16:03

How to produce rows for non-existing time buckets?

2 Answers2