
I have two Windows machines, one running the Jenkins master and one running a Jenkins slave. On both machines Jenkins is installed as a service, and the slave is configured to be taken offline after 300 minutes of inactivity. Software tests should be executed on both machines during the night. Often when I check in the morning I find the following situation:

  • Jenkins master is up and running, all tests were executed on this machine.
  • Several jobs are in starvation mode because the slave is offline.
  • The Jenkins slave Windows service is stopped.
  • Restarting the master and starting a job on the slave node does not bring the slave online.

No useful error information can be found on the slave. The last lines in jenkins-slave.err.log are:

INFO: Connected
Apr 01, 2019 3:40:23 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Apr 01, 2019 3:40:33 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.WinswSlaveRestarter@99751ad
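To line these timestamps up with entries in the Windows event log, the agent log can be reduced to (time, source, message) tuples. The following is a minimal sketch, assuming the log keeps the two-line `java.util.logging` layout shown above and English month names; `parse_events` and `restart_attempts` are hypothetical helper names, not part of Jenkins.

```python
import re
from datetime import datetime

# Matches a header line such as
# "Apr 01, 2019 3:40:23 PM hudson.remoting.jnlp.Main$CuiListener status"
HEADER = re.compile(
    r"^(\w{3} \d{2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) (\S+) (\S+)$"
)


def parse_events(lines):
    """Pair each header line with the message line that follows it."""
    events = []
    pending = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
            pending = (stamp, match.group(2))
        elif pending is not None and line.strip():
            events.append((pending[0], pending[1], line.strip()))
            pending = None
    return events


def restart_attempts(events):
    """Keep only the events where the agent tried to restart itself."""
    return [e for e in events if "Restarting agent" in e[2]]
```

Calling `restart_attempts(parse_events(open("jenkins-slave.err.log")))` would list every moment the agent asked the service wrapper for a restart, which can then be compared with the time of the "failed to start" event.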

The master just prints a lot of lines like

Apr 02, 2019 9:08:23 AM hudson.slaves.RetentionStrategy$Demand check
INFO: Disconnecting computer XYZ as it has been idle for 23 hr

The slave.log on the master does not help either:

Remoting version: 3.27
This is a Windows agent
Agent successfully connected and online
ERROR: Connection terminated
java.nio.channels.ClosedChannelException

I found an event in the Windows Event Viewer saying:

The Jenkins agent (jenkinsslave-C__Program Files (x86)_Jenkins-Slave) service failed to start due to the following error: 
The service did not respond to the start or control request in a timely fashion.

I added the following to the master and slave execution command lines:

-Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle

Once I manually start the Windows service on the slave machine, it comes back online and jobs continue.
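Since a manual start is enough to recover, the same step could be automated as a stopgap, for example from a scheduled task on the slave machine. The following is a minimal sketch, assuming the standard `sc query` / `sc start` commands and their usual English output layout; `service_state` and `ensure_running` are hypothetical helper names, and the service name has to be taken from the local installation.

```python
import subprocess


def service_state(sc_output):
    """Extract the state word (e.g. 'STOPPED', 'RUNNING') from `sc query` output."""
    for line in sc_output.splitlines():
        parts = line.split()
        # Assumed line shape: "STATE              : 1  STOPPED"
        if parts[:1] == ["STATE"] and len(parts) >= 4:
            return parts[3]
    return None


def ensure_running(service_name, run=subprocess.run):
    """Start the service if it is stopped; return True if a start was requested."""
    result = run(["sc", "query", service_name], capture_output=True, text=True)
    if service_state(result.stdout) == "STOPPED":
        run(["sc", "start", service_name], capture_output=True, text=True)
        return True
    return False
```

The `run` parameter only exists so the logic can be exercised without a real service. This works around the symptom and does not explain why the service fails to start in the first place.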

I often get the impression that this has something to do with Windows updates being installed automatically on the master. But if that is the problem, how could I make the slave reconnect?

I would be grateful for any ideas about why this is happening or how I can investigate this issue further.

Meera
  • Have you come across a viable fix for this issue? It seems it is currently open as a [minor bug](https://issues.jenkins-ci.org/browse/JENKINS-50219?focusedCommentId=355293&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-355293) in Jenkins. – Gilberto Treviño Sep 09 '19 at 14:25
  • I did not find how to fix this. I configured the slaves to stay connected all the time even through long inactivity, which seems to work well as a workaround. – Meera Sep 09 '19 at 14:37
  • Another problem is that when I restart Jenkins, it often stays shut down and I have to restart the service manually. I do not know why; maybe it has the same cause. – Meera Sep 09 '19 at 14:38
  • I have the nodes configured to be up as much as possible. Actually I ended up here because sometimes when the master is restarted, some nodes show this behavior... Normally I would just start them manually, but the number of nodes just passed 50 and today 70% of them were down, so it is starting to be an issue... But oh well... guess I will keep looking into this. Thank you for your prompt response. =) – Gilberto Treviño Sep 09 '19 at 15:54
