One day our Java web application went up to 100% CPU usage. A restart resolved the incident but not the underlying problem: a few hours later the CPU usage was back at 100%. We suspected an infinite loop introduced by a new version, but we had not made any change to the code or to the server.
We managed to find the problem by taking several thread dumps with kill -QUIT and comparing the details of every thread. We found that one thread's call stack appeared in all of the thread dumps. After analysis, it turned out that there was a while loop whose condition never became false for some data that was regularly updated in the database.
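The manual comparison described above can be partly automated from inside the JVM. The following is only a minimal sketch, assuming the JVM supports per-thread CPU time measurement and that the code runs in the same JVM as the application (or is adapted to use a remote JMX connection). The class name `BusyThreadFinder` and the method `findBusiestThread` are hypothetical names chosen for illustration. It samples each thread's CPU time twice and reports the thread that consumed the most CPU in between, together with its stack trace.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: finds the thread that burned the most CPU in a time window.
public class BusyThreadFinder {

    /**
     * Returns the id of the live thread that used the most CPU time during
     * the sampling window, or -1 if no thread used any measurable CPU time.
     */
    public static long findBusiestThread(ThreadMXBean mx, long windowMillis)
            throws InterruptedException {
        // First sample: CPU time (in nanoseconds) per thread id.
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id)); // -1 if unavailable
        }

        Thread.sleep(windowMillis);

        // Second sample: keep the thread with the largest increase.
        long busiest = -1;
        long maxDelta = 0;
        for (long id : mx.getAllThreadIds()) {
            long start = before.getOrDefault(id, 0L); // new threads start at 0
            long end = mx.getThreadCpuTime(id);
            if (start < 0 || end < 0) {
                continue; // thread died or CPU timing is disabled
            }
            long delta = end - start;
            if (delta > maxDelta) {
                maxDelta = delta;
                busiest = id;
            }
        }
        return busiest;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            System.err.println("Per-thread CPU time is not available on this JVM");
            return;
        }
        long id = findBusiestThread(mx, 1000);
        // getThreadInfo returns null if the thread is no longer alive.
        ThreadInfo info = (id < 0) ? null : mx.getThreadInfo(id, Integer.MAX_VALUE);
        if (info == null) {
            System.out.println("No busy thread found");
            return;
        }
        System.out.println("Busiest thread: " + info.getThreadName());
        for (StackTraceElement frame : info.getStackTrace()) {
            System.out.println("    at " + frame);
        }
    }
}
```

A thread stuck in an infinite loop would be expected to show up as the busiest thread in every sampling window, which is the same signal as one call stack appearing in every thread dump.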
Analyzing several thread dumps of a web application by hand is really tedious.
So do you know of any better way or tool to find this kind of issue in a production environment?