0

This question should go in the category: "a practical, answerable problem that is unique to software development".

Having a system that receives time-based events in the following form (simplified for clarity):

2020-04-10T11:00:00.000Z:system1,UP
2020-04-15T07:00:00.000Z:system1,DOWN
2020-04-20T09:15:00.000Z:system2,UP
2020-05-10T06:30:00.000Z:system3,UP
2020-05-15T22:30:00.000Z:system3,DOWN
2020-05-16T08:30:00.000Z:system3,UP

And imagine we graph this time series on a chart where:

  • X-axis: time
  • Y-axis: state (DOWN/UP which we can give values 1/0). Note that the values are reversed in order for the mathematical graph area in this question to make sense.

How can I save these events to a database (or equivalent) and compute the area between the X-axis and the graph (called numerical integration)? The start and end boundaries for the computation can be a full calendar month or the fragment of current month so far.

Example requirement

                                | <---- compute interval ----> |
                 |---------------------------------------------       <= DOWN
                 |               xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
S1 . . .|--------|. . . . . . . . . . . . . . . . . . . . . . . . . . <= UP
                                |                              |

S2 . . . . . . . . . . . .|------------------------------------ . . .
                                |                              |
                                                 |--|
                                                 |xx|
S3 . . . . . . . . . . . . . . . . . . .|--------|  |---------- . . .
                                |                              |

time:  10Apr    15Apr    20Apr 1May   10May   15May 16May   20May/now

Compute the amount of time (marked with xxx in the graphs above) each of the systems were unavailable in May considering the current date 2020-05-20. Note that:

  • system1 and system2 didn't even receive any events in May. They are "historically" DOWN and UP respectively. This case can go indefinitely long in the past.
  • system3 had a 10 hour interruption in May

And the result should be:

  • system1: X minutes (all minutes so far)
  • system2: 0 minutes
  • system3: 600 minutes

What I tried so far

  • I have studied a bit Elasicsearch and this use case does not come as an easy task to solve (one would probably have to consider transform APIs and several aggregations)
  • a potential solution with relational databases could be to save each new event with two timestamps (begin, end), and upon saving, to search the history and close the previous open end event for the same system. Then comes the collection of these intervals and intersecting them with the period/month in question, etc.
Nico Schlömer
  • 53,797
  • 27
  • 201
  • 249
Gabriel Petrovay
  • 20,476
  • 22
  • 97
  • 168
  • The computation is so trivial that it's hard to understand what the issue is. Why don't you just store all UP/DOWN events for all systems? – Nico Schlömer May 17 '20 at 19:56
  • There are several reasons one would save them: A) troubleshooting and replaying events; B) over time there might be other types of events (more that just a boolean value) that need to be reported (WARNING, DEGRADED, etc.) C) compliance reasons: (the event producers keep events for 30 days while we are responsible to be traceable 1 year) – Gabriel Petrovay May 18 '20 at 16:38

0 Answers0