
We are experiencing a problem with our Jenkins CI server.

Our CI implementation relies on several Groovy scripts, which we execute in Jenkins as "System Groovy scripts". They implement build flows and business logic steps such as version checking. It has worked this way for years, and the scripts have not been modified recently.

Yesterday, every Jenkins job we tried to launch that executed Groovy scripts in one way or another started failing with this exception:

java.lang.StackOverflowError
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.additiveExpression(GroovyRecognizer.java:12478)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.shiftExpression(GroovyRecognizer.java:9695)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.relationalExpression(GroovyRecognizer.java:12383)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.equalityExpression(GroovyRecognizer.java:12307)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.regexExpression(GroovyRecognizer.java:12255)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.andExpression(GroovyRecognizer.java:12223)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.exclusiveOrExpression(GroovyRecognizer.java:12191)
            hundreds of similar lines
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.compoundStatement(GroovyRecognizer.java:7510)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.compatibleBodyStatement(GroovyRecognizer.java:8834)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.statement(GroovyRecognizer.java:899)
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.compilationUnit(GroovyRecognizer.java:757)
at org.codehaus.groovy.antlr.AntlrParserPlugin.transformCSTIntoAST(AntlrParserPlugin.java:131)
at org.codehaus.groovy.antlr.AntlrParserPlugin.parseCST(AntlrParserPlugin.java:108)
at org.codehaus.groovy.control.SourceUnit.parse(SourceUnit.java:236)
at org.codehaus.groovy.control.CompilationUnit$1.call(CompilationUnit.java:161)
at org.codehaus.groovy.control.CompilationUnit.applyToSourceUnits(CompilationUnit.java:846)
at org.codehaus.groovy.control.CompilationUnit.doPhaseOperation(CompilationUnit.java:550)
at org.codehaus.groovy.control.CompilationUnit.processPhaseOperations(CompilationUnit.java:526)
at org.codehaus.groovy.control.CompilationUnit.compile(CompilationUnit.java:503)
at groovy.lang.GroovyClassLoader.doParseClass(GroovyClassLoader.java:302)
at groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:281)
at groovy.lang.GroovyShell.parseClass(GroovyShell.java:731)
at groovy.lang.GroovyShell.parse(GroovyShell.java:743)
at groovy.lang.GroovyShell.parse(GroovyShell.java:770)
at groovy.lang.GroovyShell.parse(GroovyShell.java:761)
at groovy.lang.GroovyShell$parse.call(Unknown Source)
at com.cloudbees.plugins.flow.FlowDSL.executeFlowScript(FlowDSL.groovy:80)
at com.cloudbees.plugins.flow.FlowRun$FlyweightTaskRunnerImpl.run(FlowRun.java:219)
at hudson.model.Run.execute(Run.java:1759)
at com.cloudbees.plugins.flow.FlowRun.run(FlowRun.java:155)
at hudson.model.ResourceController.execute(ResourceController.java:89)
at hudson.model.Executor.run(Executor.java:240)
at hudson.model.OneOffExecutor.run(OneOffExecutor.java:43)

It looks as though the Groovy parser inside Jenkins is exhausting the thread stack while parsing the Groovy script. As noted above, this abruptly started happening with many scripts that had worked perfectly before and had not been modified recently.

Our Jenkins installation (v1.594) currently runs on a WebSphere 8.5.5.2 application server on AIX v7.1 (I don't know the exact fix pack level or whether it has recently suffered any kind of update; I am still gathering that information).

After a restart, we returned to normal behavior (all the scripts were working as usual again without any modification to them).

Does anyone know of an incompatibility between Jenkins' Groovy parsing and any underlying library?

Jorge_B
  • It sounds like an update. A change to the Groovy version or any underlying dependency could cause this kind of problem. – William Greenly Apr 15 '15 at 11:56
  • Is an OS update on AIX so bad it's considered "suffering"? – cjstehno Apr 15 '15 at 18:25
  • I was thinking about a WAS fixpack, and don't know if the right term is suffer (English is not my mother tongue). Better like: if it has recently undergone any kind of update? – Jorge_B Apr 15 '15 at 18:58
  • The operations team told me that there was no update to OS, nor to the application server at all. We saw the problem again yesterday, and had to reboot Jenkins again – Jorge_B May 26 '15 at 06:38
  • so there is a tricky bug in your script, why don't you find it and post the code here? – AdamSkywalker Dec 19 '15 at 15:16
  • I would be just so glad :( My last attempt was to raise the stack size with -Xss, which has made the error less frequent but has not removed it. Regarding your suggestion, I have no fewer than 90 Groovy classes and separate scripts running on that Jenkins, none of which showed this problem before upgrading to Jenkins 1.594. I am starting to follow LTS too, in the hope that it fixes my problem – Jorge_B Dec 19 '15 at 18:21
  • I only now have spotted this question... from the version I assume you are using at least Groovy 2.3 here. And am I right in assuming that all those repeating stack trace elements are from org.codehaus.groovy.antlr.parser.GroovyRecognizer? I don't believe in a circular class reference. But if there is an overflow in this part, it means the AST is formed in an unexpected way. I would need the exact part that is repeating to get an idea what construct is causing this – blackdrag Dec 22 '15 at 15:24
  • the groovy script would really help figuring out which statements are causing the issue in this case. – dnozay Dec 23 '15 at 02:41
  • @blackdrag Thanks a lot for your answer, I will gather all the possible information the next time it happens again (we see the error once every 2 or 3 months) – Jorge_B Dec 30 '15 at 09:02
  • @blackdrag I can confirm that our current version of Groovy is the one embedded in Jenkins 1.594, that is, `println GroovySystem.version -> 1.8.9`. You can find the full trace here: http://pastebin.com/XyXDQutM – Jorge_B Jan 19 '16 at 11:39

1 Answer


There is a problem with the Groovy code that causes the parser to overflow the stack:

java.lang.StackOverflowError
at org.codehaus.groovy.antlr.parser.GroovyRecognizer.additiveExpression(GroovyRecognizer.java:12478)

Based on a similar ticket, https://issues.apache.org/jira/browse/GROOVY-1783, it is possible that your code has circular references or creates too many functions on the fly. One approach is to analyze your code and move anything that makes allocations out of loops, in particular complex inline functions.

Another approach is to look at the Build Flow plugin documentation and see how you could write an extension point rather than use Groovy. This may not be easy and requires effort, but it lets you write a lot of tests for your code. You would still use Groovy for the glue, but use Java directly for the hot spots.

A third approach would be to file a ticket on the Groovy issue tracker and see what the experts find.
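
To illustrate why a recursive-descent parser can hit a StackOverflowError, and why raising -Xss (as tried in the comments) may only make it less frequent, here is a minimal Java sketch. It is not the Groovy parser: `ParserStackDemo`, `eval`, `nested` and `parsesWithStack` are hypothetical names made up for this example, and the toy grammar only handles digits, `+` and parentheses. It assumes the real parser behaves similarly in that each nesting level of the source costs a few stack frames, so whether a given script parses depends on how much stack the calling thread has left. The `stackSize` argument of the `Thread` constructor is only a hint and some platforms may ignore it.

```java
public class ParserStackDemo {
    private final String src;
    private int pos = 0;

    private ParserStackDemo(String src) { this.src = src; }

    /** Evaluates a toy expression such as "((1))+2" by recursive descent. */
    public static int eval(String src) {
        return new ParserStackDemo(src).additive();
    }

    // additive := primary ('+' primary)*
    private int additive() {
        int value = primary();
        while (pos < src.length() && src.charAt(pos) == '+') {
            pos++;                      // consume '+'
            value += primary();
        }
        return value;
    }

    // primary := '(' additive ')' | digits
    private int primary() {
        if (src.charAt(pos) == '(') {
            pos++;                      // consume '('
            int value = additive();     // one more nesting level, two more frames
            pos++;                      // consume ')'
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        return Integer.parseInt(src.substring(start, pos));
    }

    /** Builds "(((...1...)))" with the given number of parenthesis pairs. */
    public static String nested(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append('(');
        sb.append('1');
        for (int i = 0; i < depth; i++) sb.append(')');
        return sb.toString();
    }

    /** Parses on a thread with the requested stack size; false means StackOverflowError. */
    public static boolean parsesWithStack(String input, long stackBytes) {
        final boolean[] ok = {false};
        Thread t = new Thread(null, () -> {
            try {
                eval(input);
                ok[0] = true;
            } catch (StackOverflowError e) {
                ok[0] = false;          // same input, not enough stack
            }
        }, "parser", stackBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return ok[0];
    }

    public static void main(String[] args) {
        String input = nested(20_000);
        System.out.println("small stack parses: " + parsesWithStack(input, 256L * 1024));
        System.out.println("large stack parses: " + parsesWithStack(input, 64L * 1024 * 1024));
    }
}
```

If the same input succeeds with the larger stack and fails with the smaller one, that points to deeply nested or very long expressions in the script rather than to a circular reference. Finding the offending script would then come down to parsing each one in isolation.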

dnozay