
I have a Jenkins pipeline consisting of 3 stages, with multiple concurrent builds running on different agents. The agents are launched through SSH. Random builds abort with the following message:

remote file operation failed: /home/jenkins/workspace/WORKSPACE@3 at hudson.remoting.Channel@759ed7a6:: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on failed. The channel is closing down or has closed down

The full error log is:

Error when executing always post condition:
java.io.IOException: remote file operation failed: /home/jenkins/workspace/WORKSPACE@3 at hudson.remoting.Channel@759ed7a6:<agent_label>: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on <agent_label> failed. The channel is closing down or has closed down
    at hudson.FilePath.act(FilePath.java:1043)
    at hudson.FilePath.act(FilePath.java:1025)
    at hudson.FilePath.mkdirs(FilePath.java:1213)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask$FileMonitoringController.<init>(FileMonitoringTask.java:120)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:198)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:190)
    at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:111)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:71)
    at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:176)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:229)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:153)
    at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
    at sun.reflect.GeneratedMethodAccessor12277.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:157)
    at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:133)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:120)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:155)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:159)
    at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
    at WorkflowScript.approveuser_list(WorkflowScript:67)
    at WorkflowScript.run(WorkflowScript:354)
    at ___cps.transform___(Native Method)
    at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:57)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
    at sun.reflect.GeneratedMethodAccessor510.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:103)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
    at sun.reflect.GeneratedMethodAccessor510.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:60)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:109)
    at com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:82)
    at sun.reflect.GeneratedMethodAccessor510.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
    at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
    at com.cloudbees.groovy.cps.Next.step(Next.java:83)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
    at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
    at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:122)
    at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:261)
    at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$101(SandboxContinuable.java:34)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.lambda$run0$0(SandboxContinuable.java:59)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:108)
    at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:58)
    at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:332)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$200(CpsThreadGroup.java:83)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:244)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:232)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:131)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on <agent_label> failed. The channel is closing down or has closed down
    at hudson.remoting.Channel.call(Channel.java:948)
    at hudson.FilePath.act(FilePath.java:1036)
    at hudson.FilePath.act(FilePath.java:1025)
    at hudson.FilePath.mkdirs(FilePath.java:1213)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask$FileMonitoringController.<init>(FileMonitoringTask.java:120)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:198)
    at org.jenkinsci.plugins.durabletask.BourneShellScript$ShellController.<init>(BourneShellScript.java:190)
    at org.jenkinsci.plugins.durabletask.BourneShellScript.launchWithCookie(BourneShellScript.java:111)
    at org.jenkinsci.plugins.durabletask.FileMonitoringTask.launch(FileMonitoringTask.java:71)
    at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:176)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:229)
    at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:153)
    at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
    at sun.reflect.GeneratedMethodAccessor12277.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:157)
    at org.kohsuke.groovy.sandbox.GroovyInterceptor.onMethodCall(GroovyInterceptor.java:23)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:133)
    at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:120)
    at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:155)
    at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:159)
    at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
    ... 46 more
Caused by: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2679)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3154)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)

Can this be resolved? Any suggestions related to this are welcome.

  • I have been facing the same issue for the last 3 days. Did you find a solution? – uncletall Oct 26 '18 at 03:15
  • I believe the problem is being tracked here: [Auto retry for elastic agents after channel closure](https://issues.jenkins-ci.org/browse/JENKINS-49707) – jxramos Dec 04 '18 at 19:07

3 Answers


The issue is tracked under JENKINS-49707.
To avoid it, ensure that sufficient disk space is available on the connected agents.
If you are already in the error state, disconnect the agent node from the Jenkins UI, check the available space on the agent (a minimal pipeline sketch for this follows below), and then launch it again from the Jenkins UI.

This worked for me.
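
As a quick way to surface low-disk conditions before they kill the channel mid-build, a throwaway diagnostic stage can print (and optionally enforce) the free space of the workspace filesystem. This is only a minimal scripted-pipeline sketch; the agent label 'my-agent' and the 1 GB threshold are placeholders, not part of the original answer:

// Scripted-pipeline sketch. 'my-agent' and the 1 GB threshold are assumptions.
node('my-agent') {
    stage('Check disk space') {
        // Show free space of the filesystem that holds the agent workspace
        sh 'df -h .'
        // Fail fast if less than ~1 GB (1048576 KB) is left
        sh '''
            avail_kb=$(df -Pk . | awk 'NR==2 {print $4}')
            [ "$avail_kb" -ge 1048576 ] || { echo "Low disk space on agent"; exit 1; }
        '''
    }
}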


Any chance your agent is started via SSH? We started facing this issue after migrating our Jenkins agents to a new server infrastructure.

Problem
Apparently, a RekeyLimit was configured in /etc/ssh/sshd_config on the new servers:

RekeyLimit 512M 1h

This meant that after 512 MB of transferred data or 1 h of connection time, the session key was renegotiated. In our case, this renegotiation happened during steps like 'sshagent' or 'withMaven', which led to the exact exception you are describing.

It seems that this only happens for steps that run over a long-ish period of time, experience a key renegotiation and end with some sort of file operation - just like in your case.

Solution
We resolved it by setting RekeyLimit in /etc/ssh/sshd_config on the agent to its default value:

RekeyLimit default none

and then restarting the sshd.service and Jenkins' remoting.jar.

Setting RekeyLimit to its default effectively lets the cipher decide when the session key is renegotiated:

$ man sshd_config
  RekeyLimit
    Specifies the maximum amount of data that may be transmitted before the session key is 
    renegotiated, optionally followed by a maximum amount of time that may pass before the session 
    key is renegotiated. [...] The default is between ‘1G’ and ‘4G’, depending on the cipher.  
    [...] The default value for RekeyLimit is default none, which means that rekeying is 
    performed after the cipher's default amount of data has been sent or received and no time 
    based rekeying is done.

This may still lead to random exceptions when the cipher's default amount of data is reached. We somewhat mitigated this by configuring the connection between the Jenkins master and the agent to disconnect when no jobs are running, so that we (almost) never reach the cipher's default amount of data within the same SSH session.
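
The answer does not say how the idle-disconnect was configured. One plausible way (an assumption on my part, not necessarily what the author did) is the agent's "Availability" setting ("bring this agent online when in demand, take it offline when idle"), which corresponds to hudson.slaves.RetentionStrategy.Demand and can also be applied from the Script Console. The agent name 'my-agent' and the delay values below are placeholders:

// Script Console sketch: bring the agent online when a build needs it and
// disconnect it after 10 idle minutes, so a single SSH session never stays
// open long enough to reach the rekey threshold. Delays are in minutes.
import hudson.slaves.RetentionStrategy
import jenkins.model.Jenkins

def agent = Jenkins.instance.getNode('my-agent') as hudson.model.Slave
agent.setRetentionStrategy(new RetentionStrategy.Demand(0, 10))
// Persist the main config; depending on your Jenkins version you may also
// need to re-save the node configuration from the UI.
Jenkins.instance.save()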


There may be two Jenkins servers in your network running as the same instance. Check for that possibility. To confirm the issue, try disabling the Docker container watchdog by running the following code in Jenkins' Script Console:

System.setProperty("com.nirima.jenkins.plugins.docker.DockerContainerWatchdog.enabled","false")

Please note: This will be active only until Jenkins' next restart.
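
To double-check that the flag took effect, you can read the property back in the Script Console. If you need it to survive restarts, the usual approach is to pass it as a -D option to the Jenkins JVM (-Dcom.nirima.jenkins.plugins.docker.DockerContainerWatchdog.enabled=false); how you wire that in depends on your installation.

// Script Console check: prints "false" once the watchdog has been disabled,
// or "null" if the property was never set.
println System.getProperty('com.nirima.jenkins.plugins.docker.DockerContainerWatchdog.enabled')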
