1

I want to monitor some servers, some of which are likely to have very unstable connectivity.

This is because they are located in offices that only have consumer-grade internet access.

My worry with the usual approach (an HTTP check or something similar every minute or at some other interval) is that short outages may go unnoticed because they occur between checks, yet they can very likely cause lag in sensitive service connections or interrupt them completely.

In fact, we are experiencing such problems at one location, and I have already spent hours shouting at the cable company on the phone.

We are currently evaluating the possibility of getting a direct carrier uplink to Level(3), which would be pretty awesome and would make my life a lot easier.

But in order to do that, I need reliable numbers rather than vague experience reports.

So what I would like is an uptime monitor that opens a socket connection, keeps it open, and requests a response at a fairly short interval (maybe 20 seconds).

When there's no response, the connection must have been interrupted since the last check.
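
A minimal sketch of that idea in Python, assuming there is a cooperating echo-style service on the remote side that answers each line it receives (that service is hypothetical and not shown). The names `probe` and `monitor`, the `ping` payload, and the 5-second timeout are illustrative assumptions, not part of any existing tool.

```python
import socket
import time


def probe(sock, timeout=5.0):
    """Send one request over an already-open socket.

    Returns the round-trip time in seconds, or None if the peer did not
    answer within `timeout` or the connection broke.
    """
    sock.settimeout(timeout)
    start = time.monotonic()
    try:
        sock.sendall(b"ping\n")
        reply = sock.recv(16)
    except OSError:  # includes timeouts and connection resets
        return None
    if not reply:  # empty read: the peer closed the connection
        return None
    return time.monotonic() - start


def monitor(host, port, interval=20.0, rounds=None, log=print):
    """Hold one connection open and probe it every `interval` seconds.

    Runs forever when `rounds` is None. Returns the number of failed probes.
    """
    failures = 0
    done = 0
    sock = None
    while rounds is None or done < rounds:
        if sock is None:
            try:
                sock = socket.create_connection((host, port), timeout=5.0)
            except OSError:
                sock = None
        rtt = probe(sock) if sock is not None else None
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        if rtt is None:
            failures += 1
            log(f"{stamp} DOWN")
            if sock is not None:
                sock.close()
                sock = None  # reconnect on the next round
        else:
            log(f"{stamp} up rtt={rtt * 1000:.1f}ms")
        done += 1
        time.sleep(interval)
    if sock is not None:
        sock.close()
    return failures
```

Calling something like `monitor("monitor.example.net", 7007)` (placeholder host and port) would log one line per probe. Note that a failed probe only says no reply arrived within the timeout: TCP may ride out a very short outage by retransmitting, which would show up here as a latency spike rather than as a failure.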

I would also use this approach for "better" server monitoring in the future.

So my questions are:

  • Is anything wrong with my understanding of how networks work? Is something like that a good idea?
  • Is there any solution out there that I can use right away, or do I have to do it myself?
Ben Pilbrow
The Shurrican

2 Answers

2

I think you should approach the monitoring in terms of packet loss and latency. You can use a tool that pings constantly, every second, and another tool such as Nagios to e-mail you about problems that last longer than a set period. Combine this with some well-thought-out external monitoring points to give a broader picture of what is failing and when, and you should get some good metrics to fight your corner.
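
As a rough illustration of the kind of numbers this produces, here is a small Python sketch that turns a series of per-second probe results into packet loss, average latency, and the longest outage. The function name `summarize` and the input format (one round-trip time in seconds per probe, or `None` for a lost probe) are assumptions made for the example, not the output format of any particular ping tool.

```python
from statistics import mean


def summarize(samples, interval=1.0):
    """Summarize probe results taken every `interval` seconds.

    `samples` holds one entry per probe: the round-trip time in seconds,
    or None when the probe got no reply.
    """
    lost = sum(1 for s in samples if s is None)
    rtts = [s for s in samples if s is not None]

    # Longest run of consecutive lost probes, i.e. the longest outage seen.
    longest = run = 0
    for s in samples:
        run = run + 1 if s is None else 0
        longest = max(longest, run)

    return {
        "loss_pct": 100.0 * lost / len(samples) if samples else 0.0,
        "avg_rtt": mean(rtts) if rtts else None,
        "longest_outage_s": longest * interval,
    }
```

For example, `summarize([0.02, None, None, 0.04])` reports 50% loss and a longest outage of 2 seconds, which is the sort of figure a once-a-minute check would miss.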

Jonathan Ross
1

Why not monitor all of the sensitive services rather than just doing a simple ping? Check SQL, HTTP, LDAP (or whatever). Most monitoring tools today can do deep service checks. Then you can care less about the connection, only that the apps are running well.

uSlackr