Which solutions (software or hardware) have been developed and deployed in enterprise environments for online failure prediction? I know of Zabbix, OpenTSDB, Cacti and similar tools. Can you list some more? Can you also describe their advantages and disadvantages, specifically with respect to failure prediction?
I want to understand their weaknesses so that I can improve on them with better models and algorithms. If you are not familiar with the concept of online failure prediction, please refer to the description below. If you already know it, just skip it.
Online failure prediction: an approach for evaluating whether a failure will occur in the near future, when it will occur, and in which component (software or hardware) it will occur. It is a short-term prediction based on tracking past failures, reports of detected errors, symptoms of undetected errors, and fault auditing (actively searching for faults, for example checking for inode inconsistencies in Linux filesystems).
A much more detailed introduction and a survey of the relevant approaches are given in this paper: https://s3-us-west-2.amazonaws.com/mlsurveys/88.pdf
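To make the description above concrete, here is a minimal sketch of the simplest kind of predictor I have in mind: it tracks detected error reports per component and warns when too many fall inside a short time window. The class name `ErrorRatePredictor`, the window length and the threshold are my own illustrative assumptions; they are not taken from the paper or from any of the tools mentioned, and a real predictor would use a proper model rather than a fixed count.

```python
from collections import deque


class ErrorRatePredictor:
    """Hypothetical sketch: flag a component as likely to fail soon when it
    reports at least `threshold` errors within the last `window_seconds`."""

    def __init__(self, window_seconds=60.0, threshold=5):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.events = deque()  # (timestamp, component), oldest first

    def report_error(self, timestamp, component):
        """Record one detected error and return the sorted list of components
        currently predicted to fail. Timestamps are assumed non-decreasing."""
        self.events.append((timestamp, component))

        # Discard reports that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()

        # Count the remaining reports per component.
        counts = {}
        for _, comp in self.events:
            counts[comp] = counts.get(comp, 0) + 1

        return sorted(c for c, n in counts.items() if n >= self.threshold)


if __name__ == "__main__":
    predictor = ErrorRatePredictor(window_seconds=10.0, threshold=3)
    for t in (0.0, 1.0, 2.0):
        suspects = predictor.report_error(t, "disk0")
    print(suspects)
```

This only answers "which component" and "soon"; it says nothing precise about when the failure will occur, which is exactly the kind of limitation I would like to improve on.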
Thank you very much!