6

I have a munin setup running and I'd like to leave my munin-node setup untouched while getting a longer and more detailed view of the logged data. I want to keep all logged data indefinitely. An ideal solution would use something like the Annotated Time Line widget so that I could zoom in to any point in the history.


Edit: I've already found out that munin uses a lossy database so I'm expecting I'll need something that replaces it; i.e. unless I'm mistaken, any answer that doesn't replace Munin is most likely not useful to me.

What I'm hoping for is a drop-in replacement for munin that can read the appropriate sections of the munin config files (e.g. the addresses of all the munin-nodes) and won't require any modification at all to the munin-node installs.

BCS
  • I recently did a "read up on status / monitoring / trending" compare-and-contrast and decided to use collectd for the majority of our systems monitoring. You can output CSV files and RRD at the same time; my plan is to export the CSV into something like graphite after I've got all the rest of everything worked out. – chris May 10 '11 at 16:54
  • @chris: please move to an answer – BCS May 10 '11 at 17:14
  • @chris Graphite stores its data in whisper files, which are very similar to RRD files. How is storing metrics in whisper by way of CSV any better than storing it directly in RRD? http://graphite.wikidot.com/whisper – sciurus May 10 '11 at 18:06
  • @sciurus: First let me say that as of right now, I haven't actually finished this so I may change it around completely... But my feeling is that if I've got both the RRD and a raw CSV available, I'll have more options for a graphing back-end than if I only had one. And I suspect I'd have an easier time importing / feeding a CSV into whisper than an RRD, or at worst, they'd be equally hard / easy. – chris May 10 '11 at 18:16
  • @chris good point. FYI, graphite can read and draw graphs from RRD files, so you don't necessarily need to convert anything to whisper. https://answers.launchpad.net/graphite/+question/38488 – sciurus May 10 '11 at 18:30

4 Answers

2

Munin, like every tool of its type that I'm aware of, uses round robin database, or RRD, files to store its data. Here is an explanation of the basics of RRD. An RRD file is made up of Round Robin Archives, or RRAs. An RRA is "lossy" in two senses of the word: it combines multiple data points into one, and it overwrites data after a certain amount is collected. You get to specify how this is done. For example, let's say I created an RRD file with the command

rrdtool create example.rrd \
[skip some necessary options] \
--step 300 \
RRA:LAST:0.5:1:288 \
RRA:AVERAGE:0.5:12:168 \
RRA:AVERAGE:0.5:288:28

The step of 300 says we are collecting metrics, which rrdtool refers to as primary data points or PDPs, every 5 minutes. Each RRA line specifies four things, in the form CF:xff:steps:rows.

1) The CF, or consolidation function. This determines how RRD combines multiple primary data points into consolidated data points, or CDPs. It can AVERAGE all the values, use the MINimum value, use the MAXimum value, or just use the LAST value.

2) The xff, or "x files factor", which is the ratio of the data that must be missing before the CF will return a value of UNKNOWN rather than operating on the non-missing data.

3) The steps, which is how many primary data points are used to calculate the consolidated data point.

4) The rows, which is how many consolidated data points to keep.
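To make the interaction between the CF and the xff concrete, here is a minimal Python sketch of consolidating one group of primary data points. It is not rrdtool's code: the function name `consolidate` is made up for illustration, `None` stands in for UNKNOWN, and treating LAST as "last known value" is a simplifying assumption.

```python
def consolidate(pdps, cf, xff):
    """Combine one group of primary data points (PDPs) into a single CDP.

    None stands in for an UNKNOWN sample. If the fraction of unknown
    samples is larger than xff, the consolidated value is UNKNOWN as well.
    """
    unknown = sum(1 for v in pdps if v is None)
    if unknown / len(pdps) > xff:
        return None
    known = [v for v in pdps if v is not None]
    if not known:
        return None
    if cf == "AVERAGE":
        return sum(known) / len(known)
    if cf == "MIN":
        return min(known)
    if cf == "MAX":
        return max(known)
    if cf == "LAST":
        # Simplification: take the last known sample in the group.
        return known[-1]
    raise ValueError(f"unknown consolidation function: {cf}")


samples = [10.0, 12.0, None, 14.0]  # one missing sample out of four
print(consolidate(samples, "AVERAGE", 0.5))                    # 12.0
print(consolidate(samples, "MAX", 0.5))                        # 14.0
print(consolidate([None, None, None, 14.0], "AVERAGE", 0.5))   # None
```

With an xff of 0.5, one missing sample out of four is tolerated, while three missing out of four makes the whole consolidated point UNKNOWN.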

In our example, the first RRA would keep your primary data points for one day (288 rows of 5 minutes each), the second would average your primary data points every hour and keep the hourly averages for one week (168 rows of 12 steps each), and the third would average your primary data points every day and keep the daily averages for four weeks (28 rows of 288 steps each).
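The arithmetic behind those retention periods is step × steps for the resolution of each consolidated point, and that resolution × rows for the total time covered. A small Python sketch (the helper name `rra_coverage` is made up for illustration) that works it out for the three RRA lines above:

```python
from datetime import timedelta


def rra_coverage(step_seconds, rra_spec):
    """Return (CF, resolution, retention) for an 'RRA:CF:xff:steps:rows' spec."""
    cf, _xff, steps, rows = rra_spec.split(":")[1:]
    resolution = timedelta(seconds=step_seconds * int(steps))
    retention = resolution * int(rows)
    return cf, resolution, retention


for spec in ("RRA:LAST:0.5:1:288",
             "RRA:AVERAGE:0.5:12:168",
             "RRA:AVERAGE:0.5:288:28"):
    cf, resolution, retention = rra_coverage(300, spec)
    print(f"{spec}: one {cf} point per {resolution}, kept for {retention}")
```

The same calculation shows why "lower steps and higher rows" means more detail kept for longer: steps sets how coarse each stored point is, rows sets how many of them survive before being overwritten.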

If you want Munin to retain longer and more detailed data, use RRD files that have RRAs with lower steps and higher rows. This is controlled by the graph_data_size option. Munin has a human-readable syntax to make this easy to configure. The options in our earlier example would translate to

graph_data_size custom 5m for 1d, 1h for 1w, 1d for 4w

If you want to keep your primary data points for two years, you can take a shortcut and set graph_data_size to huge.

After changing this option, you have to delete your existing RRD files so Munin will create new ones with your new retention settings.

sciurus
  • As noted in my edit, I'm not interested in a way to reduce how lossy munin is but rather a tool/daemon that *replaces* the data logger with something that uses a non-lossy DB while leaving the munin-node side (the data sources) untouched. (If I had to, I expect I could write the data logger bit in ~1kloc by using SQLite or MySQL. But that doesn't give me a dashboard.) – BCS May 10 '11 at 15:52
  • 1
    @BCS You can make munin not be lossy at all. Just set graph_data_size to huge and rotate the RRD files every two years. – sciurus May 10 '11 at 17:31
2

I recently evaluated a bunch of trending / alerting tools.

At least on their agent / collector model, there seem to be 2 different models, the "nagios / request model" and the "syslog / reporting" model.

So in the active (request) model you've got

  • Nagios: mostly for alerts but with some graphing functionality grafted on.

  • Zabbix: trending / alerting combined. Stores data in a back end SQL database (so data isn't lost / rounded as with RRD databases).

  • Munin: trending, with plugins to send data to nagios (i.e. you collect the data with munin, then run a nagios program that looks at the local data, so you don't need both a munin and a nagios agent on the remote system).

The "syslog" model uses either a multicast or unicast UDP model where the monitored system sends a UDP packet to the collector every interval of time. The traffic is unsolicited; the reporting system just sends it every interval regardless of whether the monitoring system is up or not.

collectd and ganglia both follow this model. I've never used ganglia but collectd has a little plugin that can report up / warn / critical status to nagios (and it also reports if it hasn't seen data from the host in 3 intervals of time so you see if a system crashed because it doesn't phone home).

Collectd has dreadful graphing / reporting tools out of the box but it outputs either / both RRD and CSV text files (name, time_t, value) so you can roll your own dashboard pretty easily.
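As a rough idea of what "roll your own" could start from, here is a minimal Python sketch that groups CSV rows into per-metric time series. It assumes rows shaped exactly as described above (name, time_t, value); the real collectd CSV layout may differ (for example a header row or one file per metric), the function name `load_series` is made up, and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timezone


def load_series(csv_file):
    """Group (name, time_t, value) rows into {name: [(datetime, float), ...]}."""
    series = defaultdict(list)
    for name, time_t, value in csv.reader(csv_file):
        when = datetime.fromtimestamp(int(time_t), tz=timezone.utc)
        series[name].append((when, float(value)))
    for points in series.values():
        points.sort()  # oldest first, ready to hand to a plotting library
    return dict(series)


# Invented sample rows in the assumed (name, time_t, value) layout.
SAMPLE = """load,1305046800,0.42
load,1305046500,0.37
df-root,1305046500,71.5
"""

for name, points in load_series(io.StringIO(SAMPLE)).items():
    print(name, [(when.isoformat(), value) for when, value in points])
```

Because nothing is consolidated or overwritten on the way in, every sample stays available at full resolution for whatever graphing front end you put on top.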

I didn't play with ganglia too much.

BCS
chris
0

Munin uses RRDTool to store its data. With RRD-style data storage, you lose data point resolution as time goes on, so your requirement to be able to "zoom in to any point in the history" would not work.

There may be a way to get munin to use some other type of back-end storage, but I've never had the need for that so can't confirm that this is indeed possible.

EEAA
  • I may have mucked up the terminology; the piece I'm looking for a *replacement* for is the stuff that uses RRD. From the little I've searched, it seems that the munin (as distinct from munin-node) is fairly well attached to RRD, so I expect what I'm looking for would replace it more or less completely. – BCS May 09 '11 at 18:44
-1

It's old, but Munin is still a currently used metrics technology. In our company we use something called MuninMX. It's a collector replacement in Java with a PHP-based frontend.

The cool thing is that we didn't need to replace Munin; we just plugged in another collector and frontend. And the big plus: it uses TokuMX as its storage backend, not RRD files.

We track around 1000 nodes with 50k plugins in total on a single quad-core machine without I/O troubles.

Also, it seems that the original Munin is moving to a database configuration and a JSON API as well. Maybe in time Munin will also be able to store data in, for example, InfluxDB.

Daniel
  • 1
  • How does that answer the question? – Deer Hunter Dec 02 '14 at 18:22
  • @DeerHunter The answer ~"Munin is old but still current/useful" Does actually answer the question. Or tries to. I won't say it's a *good* answer, but it's still at least an attempt at an answer, in my opinion. – HopelessN00b Dec 03 '14 at 00:23