
We are using Jenkins (recently updated) with the Ansible plugin and the Ansible Tower plugin to connect to our AWX tower. Most of the time it works well, but lately the tower sometimes fails to respond correctly to Jenkins. This does not always happen, but it happens frequently enough to be a major concern.

When the issue occurs, the error messages I receive in Jenkins are along these lines:

ERROR: Failed to get job status from Tower: Unable to make tower request: Connection reset

ERROR: Failed to get job events from tower: Unexpected error code returned (503)

The normal response should be:

Tower completed the requested job

The "Enable Debugging" option is turned on for the Ansible Tower configuration in Jenkins, but I have not seen any additional output in the Jenkins job logs so far.

Last time the connection failed, I went into the Jenkins settings and clicked "Test Connection" for the Ansible Tower plugin, and it worked right away.

I have not seen the web interface fail, and the jobs do complete normally. The issue lies in communication between Jenkins and AWX.

Jenkins and all the plugins were recently updated.

The person who installed AWX is no longer with us, and I don't know where else to go to help me troubleshoot this.

Versions:

  • AWX version: 9.0.0.0
  • AWX install method: openshift sts
  • Ansible version: 2.8.5
  • Operating System: N/A
  • Web Browser: N/A
  • Jenkins: 2.204.2
  • Jenkins Ansible plugin: 1.0
  • Jenkins Ansible Tower plugin: 0.14.0

In the Jenkins pipeline, the following code handles the Ansible part:

wrap([$class: 'AnsiColorBuildWrapper', colorMapName: "xterm"]) {
    ansibleTower( [parameters] )
}

I don't have access to Jenkins on the file system level, only the general web UI.

I'd appreciate any troubleshooting steps you could provide or advice on where else to ask.

semmelbroesel
  • You have to look in AWX's logs to see why AWX answers with 503. – Tony Stark Apr 06 '20 at 10:27
  • @TonyStark I'm comparing logs between a working job of the same template and the one that did not respond back. I'm not seeing any error messages or major differences. The job completed normally and is sending notifications. AWX is installed in Openshift as STS - I'm only looking at the main log in awx-celery - is there another spot where I should check? I SSH'd into 2 pods and couldn't find anything else useful so far. – semmelbroesel Apr 08 '20 at 14:40
  • For completeness, here is another error message I have come across: "Unable to lookup job template Unable to find job template: Unable to get oauth token, server responded with (503)" The Jenkins log unfortunately only holds a few hours, so I have to catch this right away, and it happens intermittently. I'm about to try running a quick Jenkins job multiple times in a row to see if I can get it to fail. – semmelbroesel Apr 09 '20 at 16:32
  • I finally got it to fail while I was watching, and the only part in the Jenkins log that caught my eye (and that matched the Docker slave) was: "Failed to send back a reply to the request hudson.remoting.Request$2@[...]: hudson.remoting.ChannelClosedException: Channel hudson.remoting.Channel@[...]:docker-[...]: channel is already closed" - still hoping for an answer on what could cause this to only happen once in a while. – semmelbroesel Apr 09 '20 at 21:01
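
The repeated-run idea from the comments could also be pointed at AWX directly, to check whether the 503 and "Connection reset" responses show up without Jenkins in between. Below is a minimal Python sketch of such a probe. It assumes the AWX `/api/v2/ping/` endpoint answers without authentication on this install and that the script runs on a machine that can reach the AWX route; `fetch_status` and `probe` are hypothetical helper names, not part of any plugin.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or a short label for transport errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses, such as the 503 reported by Jenkins.
        return err.code
    except urllib.error.URLError as err:
        # Covers connection resets, TLS problems and DNS failures.
        return f"URLError: {err.reason}"
    except OSError as err:
        # Covers low-level socket errors, e.g. a timeout while reading.
        return f"{type(err).__name__}: {err}"


def probe(url, attempts=50, delay=2.0, fetch=fetch_status):
    """Call fetch(url) repeatedly and count how often each outcome occurs."""
    outcomes = Counter()
    for attempt in range(1, attempts + 1):
        result = fetch(url)
        outcomes[result] += 1
        if result != 200:
            # Timestamp every non-200 result so it can be matched against AWX logs.
            print(f"{time.strftime('%H:%M:%S')} attempt {attempt}: {result}")
        if attempt < attempts:
            time.sleep(delay)
    return outcomes
```

Calling `probe()` with the ping URL of the AWX route returns a counter of outcomes. If 503s or resets appear here as well, the problem probably sits in front of or inside AWX (route, pods) rather than in Jenkins; if every attempt returns 200 while Jenkins still fails, the Jenkins agent side is the more likely place to look.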

0 Answers