I have a portal that runs over SSL on Glassfish and uses Liferay. The last time we sent an email that brought approximately 200 people to the portal at the same time to access newly released information, our Glassfish "stalled".
From the server we could see that system resources were OK:

- Glassfish can use up to 8 GB of memory but was using only 5 GB
- The server has 4 CPUs and the overall usage was around 30%
- Glassfish is configured with up to 400 HTTP threads
As soon as we detected that our server wasn't answering users, we started a profiler in order to understand what was going on.
The threads overview shows too many blocked threads:
From the stack it's not possible to see any code other than sun, grizzly, and catalina classes:
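For anyone who wants to collect the same kind of information without a profiler, here is a minimal sketch that lists blocked threads together with the lock each one is waiting for and the thread that holds it. It uses only the standard `java.lang.management` API. The class name `BlockedThreadReport` and its method are hypothetical names made up for illustration, and the code is assumed to run inside the same JVM as Glassfish (for example, called from a diagnostic servlet), which is not something we have set up on our side.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Glassfish or Liferay.
public class BlockedThreadReport {

    // Returns one line per BLOCKED thread: its name, the lock it waits for,
    // the thread currently holding that lock, and the top stack frame.
    public static List<String> describeBlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        // false, false: skip locked-monitor and synchronizer details to keep the dump cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            String top = stack.length > 0 ? stack[0].toString() : "(no stack)";
            lines.add(info.getThreadName()
                    + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName()
                    + " at " + top);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> blocked = describeBlockedThreads();
        System.out.println("Blocked threads: " + blocked.size());
        blocked.forEach(System.out::println);
    }
}
```

If many threads turn out to be waiting on the same lock, the "held by" thread would be the one to look at first. Running `main` on its own only inspects the small JVM it starts, so it is useful for checking the output format, not for diagnosing the server.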
I would like to fix this issue, but right now I can't tell whether I should work on our code or change some component, for example by disabling SSL.
Any thoughts would be greatly appreciated.
Thanks.