I have a script that sometimes drives the server to 100% CPU. I suspect a regex is responsible. Is there any example input that could cause catastrophic backtracking for this regex: [A-Z]*([0-9A-Z])-[1-9]*([0-9])

It is used to match Jira ticket patterns like ABC-123 in commit messages:

readonly ticket_regexp='[A-Z]*([0-9A-Z])-[1-9]*([0-9])'
# A plain double-quoted string can span several lines in bash.
readonly commit_msg="
* ABC-123 Added some content to the file
* DEF-456 Added some content to the file
"

if [[ ! "${commit_msg}" =~ ${ticket_regexp} ]]; then
    echo "Does not contain the required Jira ticket reference" >&2
    exit 1
fi

I am looking for values of the variable commit_msg that would make the regex backtrack heavily and push the CPU to 100%.

nicolattu

1 Answer

As far as I can see, the regex isn't the problem - it's likely somewhere else in your code.

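Catastrophic backtracking usually occurs with nested greedy quantifiers, such as (a+)+, or with back-to-back quantifiers that can match the same characters in more than one way. Possessive quantifiers are the opposite case: they exist to prevent backtracking. Your regex has no nested or overlapping quantifiers; each * is followed by a single character class and the two halves are separated by a literal hyphen. A bit of fuzz testing on regex101.com also showed no variation of a non-matching input that made the number of steps needed to report failure grow exponentially.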
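If you want to check this on the machine where the script actually runs, a rough timing loop like the one below may help. It is only a sketch: the helper names repeat and has_ticket are made up for illustration, the near-miss input is just one guess at a hard case, and the timings depend on the regex library your bash is linked against, since =~ hands the pattern to the system's POSIX extended regex implementation. If the times grow roughly in step with the input length, exponential backtracking is unlikely to be the cause.

```bash
#!/usr/bin/env bash
# Sketch: time the question's regex on non-matching inputs of doubling length.

ticket_regexp='[A-Z]*([0-9A-Z])-[1-9]*([0-9])'

# Hypothetical helper: print string $1 repeated $2 times.
repeat() {
    local pad
    printf -v pad '%*s' "$2" ''      # $2 spaces
    printf '%s' "${pad// /$1}"       # replace each space with $1
}

# Hypothetical helper: succeed if $1 contains a ticket reference.
has_ticket() {
    [[ "$1" =~ ${ticket_regexp} ]]
}

TIMEFORMAT='%R s'                    # print wall-clock time only
for n in 1000 2000 4000 8000; do
    # Near-miss input: many "KEY-" prefixes, never a digit after the hyphen.
    msg="$(repeat 'ABC9-' "$n")"
    printf 'length %d: ' "${#msg}"
    time has_ticket "$msg" || true   # non-match is expected here
done
```

Swap in other near-miss strings (for example one taken from a commit message that was being processed when the CPU spiked) to test the inputs you actually care about.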

That being said, if you're still worried about backtracking and don't trust your regex, did you know Atlassian themselves have released official regexes to match JIRA ids?

Nick Reed