
I just read through some of the Icinga documentation on performance data collection and processing, but a few things are still unclear to me:

  1. Writing to a file on disk -> does this roll over, and how?
  2. I would like to skip the on-disk 'buffer' and pipe directly into a post-processing script that puts the data into an external database. Is this possible, and how? (I saw there is a pipe mode, but it is not fully clear to me how it works, as most examples and setups use files.) What are the risks of using a pipe if the database is not reachable or the receiving process dies?
  3. Load on a busy box if intermediary files are used - we experienced some high load and are unsure whether piping through directly would be better (except for some failure scenarios, see the second question).

Thanks a lot!

ps: tagged under nagios, as an icinga tag isn't available yet and I don't have sufficient points so far ;-).

dalini

1 Answer


ad 1) Depending on the method used, you can define a file rotation interval plus a command that moves the files aside with a timestamp suffix. There are well-known grapher addons out there, like PNP4Nagios or inGraph, which describe this in their requirements - e.g. http://docs.pnp4nagios.org/pnp-0.6/config#bulk_mode_with_npcd . Regarding rotation by the core: you need to make sure that something actually processes the rotated performance data files, and you should monitor that processing so you don't end up with a full filesystem or the like.
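To illustrate, a minimal sketch of such a bulk-mode setup (directive names are the standard Icinga 1.x/Nagios perfdata options; paths, the 15s interval, and the npcd spool directory are illustrative):

```
# icinga.cfg -- write service perfdata to a spool file, rotate it periodically
process_performance_data=1
service_perfdata_file=/var/spool/icinga/service-perfdata
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file

# commands.cfg -- move the file aside with a timestamp suffix;
# a separate daemon (e.g. npcd) then picks up the rotated files
define command {
    command_name    process-service-perfdata-file
    command_line    /bin/mv /var/spool/icinga/service-perfdata /var/spool/icinga/npcd/service-perfdata.$TIMET$
}
```

The mv is cheap as long as source and target are on the same filesystem; the consumer daemon works through the timestamped files at its own pace.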

ad 2) Directly piping data from the core to an external handler is possible by defining a command that does that - but keep in mind that this won't happen asynchronously and may block the core. Your processing application needs to take the data being fed and put it onto a queue itself. This also becomes a problem if the database is gone: if your handler cannot kill itself on a connection timeout, it will harm the overall performance of your monitoring core (yeah, that's a known problem of the 1.x architecture, which is why spool files on disk are the better approach).
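The "put it onto a queue itself" part could be sketched like this - a hypothetical handler where the producer never blocks, a worker thread drains the queue into the database, and a full queue (database down or slow) spills over to a spool instead of stalling the caller. The in-memory lists standing in for the database and the on-disk spool are illustrative, not any real Icinga API:

```python
import queue
import threading

# Bounded queue so a dead or slow database can never block the producer.
PERFDATA_QUEUE = queue.Queue(maxsize=1000)
db_rows = []   # stand-in for the external database
spooled = []   # stand-in for an on-disk overflow spool

def accept_perfdata(line):
    """Called with each perfdata line; must return immediately."""
    try:
        PERFDATA_QUEUE.put_nowait(line)
    except queue.Full:
        spooled.append(line)  # fall back to spooling instead of blocking

def db_writer():
    """Worker thread: drains the queue into the database asynchronously."""
    while True:
        line = PERFDATA_QUEUE.get()
        if line is None:      # shutdown sentinel
            break
        db_rows.append(line)  # real code: INSERT with a connect timeout
        PERFDATA_QUEUE.task_done()

worker = threading.Thread(target=db_writer, daemon=True)
worker.start()
accept_perfdata("host1;load;load1=0.42")
accept_perfdata("host1;disk;/=81%")
PERFDATA_QUEUE.put(None)      # stop the worker
worker.join()
print(db_rows)                # -> ['host1;load;load1=0.42', 'host1;disk;/=81%']
```

The point is the decoupling: the core-facing side only ever does a non-blocking enqueue, and all database latency lives in the worker.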

ad 3) Not sure if I got that correctly, but there are some things you really should keep in mind when using spool files on disk:

  • if rotation moves files between different filesystems, the mv will take longer than within the same filesystem, because the data has to be copied instead of just renaming the inode
  • if using the same filesystem, make sure your underlying hardware (RAID, HDD) is fast enough
  • you should of course put all the temporary data created by Icinga onto the same filesystem - but not the one your database or RRDtool storage is located on
  • if you don't care about losing not-yet-processed spool data, create a tmpfs and configure the spool files there
  • do not use advanced filesystems with snapshot/backup functionality, such as ZFS/XFS/Btrfs, for such transitional data - this will significantly decrease performance on large-scale systems
  • monitor I/O wait and filesystem usage to get an idea of possible bottlenecks
  • if processing happens with RRDtool afterwards, make sure to use rrdcached to speed up the processing application
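The tmpfs and rrdcached points above could look like this as configuration - sizes, paths, and flush intervals are illustrative and need tuning for your box:

```
# /etc/fstab -- small tmpfs for spool data you can afford to lose on reboot
tmpfs  /var/spool/icinga/perfdata  tmpfs  size=256m,mode=0755  0 0

# run rrdcached with a flush interval and a journal directory,
# then point rrdtool updates at its socket
rrdcached -l unix:/var/run/rrdcached.sock -w 300 -z 60 -j /var/lib/rrdcached/journal
rrdtool update --daemon unix:/var/run/rrdcached.sock /var/lib/rrd/load.rrd N:0.42
```

With rrdcached in front, updates are batched in memory and flushed at most every 300 seconds here, which cuts the random-write load on the RRD storage considerably.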

Going back to the synchronous mode will require your processing application to use some sort of queue itself, which is not what direct database access gives you. Even inGraph (https://www.netways.org/projects/ingraph/wiki) is built with a collector daemon inserting the data into the database. In short: doing that with 1.x is dangerous; with Icinga 2 it will be possible, as it has its own queueing mechanism.

dnsmichi
  • Great - ad 1) works just great! We even dump it several times and each backend can fetch its data as needed. No blocking; just keep an eye on the filesystem usage on the Icinga box! – dalini Mar 03 '14 at 12:44