2

I support an application that uses two Windows services to perform two very different tasks. One controls scheduled executions of an app; the other runs continuously, listening for HL7 messages transmitted on a particular port. Both have a propensity to fail. We have identified failed connections to database servers, network shares and the like as the likely cause, and most of the time the services restart with no issue. The problem is that sometimes a service stops working but continues to report a status of 'Running', both in the Windows Services console and when queried from the command line. Only when you try to stop and restart the service do you realize it is no longer working: the stop attempt from the Services console times out and returns a generic 'Service is not responding in a timely manner.' message. You then have to kill the process the service is running in to stop it.

I would like to know two things. One: is there a way of monitoring services that returns more information than just the reported status of the service? I am open to third-party options.

Two: can the default Windows service recovery options perform a scheduled restart that still works when the service is in this misreporting state, that is, when a normal attempt to stop it times out with the error message above?

Apologies for the wordiness. Trying to balance as much information as useful vs spewing out the hours of putzing about I have spent working on this.

Tom Benson

3 Answers

2

I would suggest setting automatic restart on the Recovery tab of the service's properties. Alternatively, if the service writes an event to Event Viewer when it fails, you can create a scheduled task that is triggered by that specific event ID and restarts the service.
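For a service that hangs while still reporting 'Running', the task's action may need to kill the process rather than ask the service to stop. A minimal sketch of that idea in Python, assuming the service runs in its own process (killing a shared host process would take other services down with it) and that the script runs with administrative rights. The function names `parse_pid` and `force_restart` are hypothetical; `sc queryex`, `taskkill /F /PID` and `sc start` are the standard Windows commands.

```python
import re
import subprocess
from typing import Optional


def parse_pid(queryex_output: str) -> Optional[int]:
    """Extract the PID from the text printed by `sc queryex <service>`."""
    match = re.search(r"^\s*PID\s*:\s*(\d+)\s*$", queryex_output, re.MULTILINE)
    if match is None:
        return None
    pid = int(match.group(1))
    # sc reports PID 0 when the service has no running process.
    return pid if pid > 0 else None


def force_restart(service_name: str) -> None:
    """Kill the service's process by PID, then start the service (Windows only)."""
    query = subprocess.run(
        ["sc", "queryex", service_name],
        capture_output=True, text=True, check=True,
    )
    pid = parse_pid(query.stdout)
    if pid is not None:
        # /F forces termination, sidestepping the stop request that times out.
        subprocess.run(["taskkill", "/F", "/PID", str(pid)], check=True)
    # Not checked: if a recovery action has already restarted the service,
    # `sc start` reports that it is already running.
    subprocess.run(["sc", "start", service_name])
```

Calling `force_restart("MyHl7Listener")` (a placeholder service name) from the scheduled task would then replace the plain stop/start that times out.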

Mr. Raspberry
  • 1
    I understand the logic behind using an event ID trigger to restart the service, but will that work given that the service will be misreporting its status and the stop attempt will time out? Or does this have some provision for killing the service by stopping the process identified by its PID? – Tom Benson Feb 01 '18 at 13:45
1

So the service is still reported as running but is not delivering the functions that it should. Try using Performance Monitor to see whether one of the process counters for this service goes out of whack when it stops operating normally. If you can find performance data that indicates the service is unhealthy:

  • Many third-party tools can be configured to restart the service based on a performance counter trigger.
  • You can set up a data collector of the type Performance Counter Alert in perfmon to cycle the service when the threshold is reached. This option is rather hair-trigger in its response: if you want to wait until the threshold has been exceeded for, say, at least 1 minute before restarting the service, it is not a good option. If that is not a concern, see "How can I monitor memory usage for a windows-based JVM and trigger an alert if it gets too high?"
  • You could also monitor a perf counter from a scheduled task using PowerShell and Get-Counter -MaxSamples 999 -SampleInterval 999 -Counter XXX to work around the hair-trigger nature of the former.
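The "wait until the threshold has been exceeded for a while" logic from the last bullet can be sketched independently of how the samples are collected. This is an illustrative Python version, not part of any tool mentioned here; the function name `sustained_breach` and its parameters are hypothetical.

```python
from typing import Iterable


def sustained_breach(samples: Iterable[float], threshold: float,
                     min_consecutive: int) -> bool:
    """Return True once `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        # A single sample at or below the threshold resets the streak,
        # so short spikes do not trigger a restart.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

With one sample every 5 seconds, `min_consecutive=12` would correspond to the counter staying above the threshold for about one minute before the service is cycled.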
Clayton
0

Nagios Core is a free tool that can be used to monitor Windows services, and can automatically restart services that fail. They have a paid product as well (Nagios XI) that is great for larger environments.

Here's where you can find the pitch and download link: https://www.nagios.com/solutions/windows-service-monitoring/

An example of using an event handler to start a process can be found here: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/eventhandlers.html
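The example in that document is a shell script that decides whether to act based on the service state, state type and check attempt, which Nagios passes to the handler through the `$SERVICESTATE$`, `$SERVICESTATETYPE$` and `$SERVICEATTEMPT$` macros. A rough Python sketch of the same decision follows, with a hypothetical function name and an assumed default of acting on the third soft attempt. How the restart is then carried out on the Windows host is left out, since it depends on the agent in use.

```python
def should_restart(state: str, state_type: str, attempt: int,
                   soft_attempt: int = 3) -> bool:
    """Decide whether an event handler should try to restart the service."""
    if state != "CRITICAL":
        # OK, WARNING and UNKNOWN states are left alone.
        return False
    if state_type == "HARD":
        # Retries are exhausted; restart as a last resort.
        return True
    # While still retrying (SOFT), act only on the chosen attempt so the
    # restart happens before the problem turns into a HARD state.
    return state_type == "SOFT" and attempt == soft_attempt
```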

sippybear