I'm trying to manage Splunk with Chef and ran across a problem when using Chef to programmatically start/stop/restart the Splunkforwarder service:

The request did not respond to the start or control request in a timely fashion. - ControlService: The service did not respond to the start or control request in a timely fashion.

Once the Chef run fails, I'm able to start the service with net start splunkforwarder.

I decided to try going through the restart (stop/start) process manually in PowerShell, and now when I try to stop the service with net stop splunkforwarder I get an error:

The service is not responding to the control function.

Sometimes stopping the service works, but rarely. At this point I have no idea what's going on because I'm not used to working on Windows (or with Splunk), and I'm not sure whether the error from Chef and the net stop splunkforwarder error are related.

I've also found if I interact with splunk directly through the executable file in C:\Program Files\SplunkUniversalForwarder\bin\splunk.exe that everything works fine. I can cd to that directory and run ./splunk restart without problems.

Anyone know what's going on or have advice on next-steps for troubleshooting?

vsminkov
sixty4bit
  • If you are getting the same error manually, this is unlikely to be related to Chef. Going to remove the tag. – coderanger Sep 06 '16 at 18:50
  • `net stop splunkforwarder` isn't going through it in PowerShell, that would be `Get-Service` and `Restart-Service` cmdlets. (Not that I expect those to behave differently if Splunk is not responding to service control requests, but there is a `-Force` parameter available). – TessellatingHeckler Sep 06 '16 at 19:07
  • This really isn't a programming question, but a question as to why Splunk doesn't respond to a service stop command (report it to Splunk; they should be able to determine what thread is blocking from a process dump, or a series of process dumps). I have another service with a similar problem, and I use a PowerShell script to stop the service (`stop-service`), sleep a bit, and then check whether the process is still running. If it is, I use `stop-process` to kill it and then `start-service` to restart it. – Tony Hinkle Sep 06 '16 at 19:12
  • And to specifically answer your question, "advice on next steps," contacting Splunk support to report the problem is the best thing to do unless you find documentation of a known issue with information on how to address it or which newer version has a fix for that defect. – Tony Hinkle Sep 06 '16 at 19:17
  • where would be a better place (StackExchange site) to ask the question? – sixty4bit Sep 06 '16 at 19:18
  • http://serverfault.com/ – Ben Lavender Sep 06 '16 at 21:03

1 Answer

This does belong on ServerFault.

However.

Stopping a service with net.exe actually does two things:

  1. Sends the stop command to the service.
  2. Waits for a period of time (either 30s or 60s, IIRC) to see if the service has stopped. If it has, it reports success. Otherwise, it reports an error.

My guess is that net.exe stop splunk is sometimes hitting whatever timeout net.exe uses.

What you can do instead is:

sc.exe stop splunk

The sc.exe command only does step 1: it sends the stop command and returns immediately. The PowerShell cmdlet Stop-Service, by contrast, waits for the service to stop by default (PowerShell 5.0 and later accept a -NoWait switch to return immediately). Note that net.exe and sc.exe are not native PowerShell commands or cmdlets. They're standard programs.

You can also do this to wait, say, 5 seconds:

$svc = Get-Service splunk;
$svc.Stop();
$svc.WaitForStatus('Stopped','00:00:05');

And then you can look at $svc.Status to see what it's doing.
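
One thing to be aware of: when the timeout elapses, WaitForStatus throws an exception rather than returning quietly. A minimal sketch of wrapping that up is below. The function name Stop-ServiceWithTimeout and its parameters are made up for illustration, and it assumes the account running it is allowed to control the service:

```powershell
# Hypothetical helper: send a stop request, wait up to a timeout, report the final state.
function Stop-ServiceWithTimeout {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [int] $TimeoutSeconds = 30
    )

    $svc = Get-Service -Name $Name

    if ($svc.Status -ne 'Stopped') {
        try {
            # Ask the service to stop, then block until it does or the timeout elapses.
            $svc.Stop()
            $svc.WaitForStatus('Stopped', [TimeSpan]::FromSeconds($TimeoutSeconds))
        } catch {
            # Stop() or WaitForStatus() failed (for example, the timeout was hit).
            # Fall through and report whatever state the service is in now.
            Write-Warning "Service '$Name' did not stop cleanly: $($_.Exception.Message)"
        }
    }

    # Re-read the status from the service control manager before returning it.
    $svc.Refresh()
    return $svc.Status
}
```

Something like Stop-ServiceWithTimeout -Name splunk -TimeoutSeconds 5 would then return the status after the wait, so the caller can decide what to do if it is still StopPending or Running.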

Or you can tell it to wait indefinitely:

$svc = Get-Service splunk;
$svc.Stop();
$svc.WaitForStatus('Stopped');

Get-Service returns an object of type System.ServiceProcess.ServiceController. You can take a look at $svc | Get-Member or $svc | Format-List to get more information about the object and what you can do with it.

If you want your script to be able to kill the process, you'll probably need to get the PID. That's somewhat more complex because the above class doesn't expose the PID for some stupid reason. The typical method is WMI:

$wmisvc_pid = (Get-WmiObject -Class Win32_Service -Filter "Name = 'splunk'").ProcessId;
Stop-Process $wmisvc_pid -Force;
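
Putting the pieces together, here is a rough sketch of the stop, wait, kill, start sequence described in the comments. Restart-ServiceForcefully and its parameters are hypothetical names, the timeouts are arbitrary, and killing the process is a last resort that skips any clean shutdown the service would normally do. Get-WmiObject is only available in Windows PowerShell; on PowerShell 7 you would likely use Get-CimInstance instead.

```powershell
# Hypothetical helper: try a clean stop, kill the process if that fails, then start again.
function Restart-ServiceForcefully {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [int] $TimeoutSeconds = 30
    )

    $svc = Get-Service -Name $Name

    if ($svc.Status -ne 'Stopped') {
        try {
            # Polite attempt: send the stop request and wait for it to take effect.
            $svc.Stop()
            $svc.WaitForStatus('Stopped', [TimeSpan]::FromSeconds($TimeoutSeconds))
        } catch {
            # The stop request failed or timed out: look up the PID through WMI and kill it.
            $wmiSvc = Get-WmiObject -Class Win32_Service -Filter "Name = '$Name'"
            if ($wmiSvc -and $wmiSvc.ProcessId -gt 0) {
                Stop-Process -Id $wmiSvc.ProcessId -Force
            }
            # Give the service control manager a moment to notice the process is gone.
            # If this wait also times out, the error propagates to the caller.
            $svc.WaitForStatus('Stopped', [TimeSpan]::FromSeconds(10))
        }
    }

    Start-Service -Name $Name
}
```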

WMI also exposes its own StartService() and StopService() methods, and is particularly useful because Start-Service and Stop-Service don't work remotely, but WMI does.
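
As a sketch of the remote case, the following uses the StopService method of the Win32_Service WMI class. Stop-RemoteService, the computer name, and the service name are placeholders, and it assumes WMI access to the target machine is permitted by the firewall and your credentials:

```powershell
# Hypothetical helper: send a stop request to a service on another machine via WMI.
function Stop-RemoteService {
    param(
        [Parameter(Mandatory)] [string] $ComputerName,
        [Parameter(Mandatory)] [string] $Name
    )

    $svc = Get-WmiObject -Class Win32_Service -ComputerName $ComputerName -Filter "Name = '$Name'"
    if (-not $svc) {
        throw "Service '$Name' was not found on '$ComputerName'."
    }

    # StopService() only sends the request; a ReturnValue of 0 means it was accepted.
    $result = $svc.StopService()
    return $result.ReturnValue
}
```

A non-zero return value means the request was rejected, so the caller should check it rather than assume the service stopped.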

Bacon Bits