I have an existing Jenkins job that kicks off a shell script to copy my prod environment into QA.
We added a lot of data to prod (the gzip dump went from 2 GB to 15 GB) and all of a sudden my Jenkins jobs started failing.
We are running Postgres 9.5 in AWS and Jenkins 2.171. All Jenkins jobs are executed on master, which is the same server, with 6 executors. There are no memory, CPU, or disk space issues.
Tried a few things: statement_timeout on the Postgres instance is already 0. Switching from bash to sh for some reason helped on some scripts but not others. In particular, this one still has various psql statements reported as Killed. The script works fine when run from an interactive shell.
Also tried disabling the Process Tree Killer (https://wiki.jenkins.io/display/JENKINS/ProcessTreeKiller). No go.
Here's the code from two of the more innocuous commands that should run pretty quickly. $POSTGRES_HOST_OPTS only has the db name and port:
echo -e "Running POSTGIS command"
psql $POSTGRES_HOST_OPTS -U $POSTGRES_ENV_POSTGRES_USER_PROD -d postgres -c "CREATE EXTENSION postgis;"
echo -e "Creating temporary user dv3_qa_tmp so we can rename the $POSTGRES_ENV_POSTGRES_USER_PROD user\n"
psql $POSTGRES_HOST_OPTS -U $POSTGRES_ENV_POSTGRES_USER_PROD -d postgres -c "create role dv3_qa_tmp password '$PGPASSWORD_QA' createdb createrole inherit login;"
Here's the output from the Jenkins console:
Waiting for new instance to be available...
-e Renaming database dv3_prod to dv3_qa
Killed
-e Running POSTGIS command
Killed
-e Creating temporary user dv3_qa_tmp so we can rename the dv3_prod_user user
Killed
-e Renaming user dv3_prod_user to dv3_qa_user
Killed
Killed
-e
All done
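A minimal sketch for narrowing down what is ending these commands, assuming the script runs under a POSIX shell: wrap each call so its exit status gets logged. A status above 128 normally means the process was terminated by a signal (status minus 128), so 137 would point at SIGKILL. The helper name run_and_report is hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): run a command and
# report on stderr whether it exited normally or was ended by a signal.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        # By shell convention, 128 + N means "terminated by signal N".
        sig=$((status - 128))
        echo "'$1' was terminated by signal $sig ($(kill -l "$sig"))" >&2
    elif [ "$status" -ne 0 ]; then
        echo "'$1' exited with status $status" >&2
    fi
    # Hand the original status back so the caller can still act on it.
    return "$status"
}
```

In the script above, this could be used by prefixing a call, for example run_and_report psql $POSTGRES_HOST_OPTS -U $POSTGRES_ENV_POSTGRES_USER_PROD -d postgres -c "CREATE EXTENSION postgis;". It only reports which signal arrived, not who sent it, so it is a starting point rather than a diagnosis.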
From the jenkins.log there is something about leaked file descriptors, but I'm not sure how that is related. I've also tried redirecting stderr, which gets rid of this message but doesn't stop the commands from being killed.
Apr 10, 2019 4:23:31 PM hudson.Proc$LocalProc join
WARNING: Process leaked file descriptors. See https://jenkins.io/redirect/troubleshooting/process-leaked-file-descriptors for more information
java.lang.Exception
at hudson.Proc$LocalProc.join(Proc.java:334)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:741)
at hudson.model.Build$BuildExecution.build(Build.java:206)
at hudson.model.Build$BuildExecution.doRun(Build.java:163)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
at hudson.model.Run.execute(Run.java:1818)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)