
Something strange is happening on our production box. Code functionality: a UI servlet acquires a monitor lock on the document object that the user is acting on and performs some computation on it. The lock prevents multiple users from modifying the same document object concurrently.

Issue observed in prod: a few user actions are timing out.

Log analysis: the thread handling a timed-out user action prints all of its log statements up to the point where it acquires the monitor lock on the document object. It then disappears from the logs for over an hour, after which it suddenly comes back to life, performs the computation, and attempts to send a response, which obviously fails because the HTTP request has already timed out. We have checked the logs and the code and can confirm that no other thread had acquired the monitor lock on that particular document object, so the lock was uncontended at the point in question.

What could be the possible issue? Is it just that the thread was put into a Runnable state on encountering a synchronized block and for the next 60-80 mins, the CPU never got a chance to run this particular runnable thread?

RGJ
  • 11
  • 2
  • 1
    "Is it just that the thread was put into a Runnable state on encountering a synchronized block and for the next 60-80 mins, the CPU never got a chance to run this particular runnable thread?" Sounds very unlikely to me. – mm759 Nov 01 '16 at 07:25
  • 1
    You can use jstack to try to get more information. – mm759 Nov 01 '16 at 07:26
  • What else happened immediately before the thread became alive again? Maybe there are other locks involved? If you get a stack trace dump, that will show all the locks the threads hold or wait for. – Thilo Nov 01 '16 at 07:49
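
The lock information that a jstack dump provides can also be collected from inside the JVM with the standard java.lang.management API. What follows is a minimal sketch, assuming it is acceptable to add a diagnostic hook to the application; the class name LockReport, the method blockedThreads, and the stand-in document object are hypothetical and not taken from the original code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class LockReport {
    /** Describes every thread that is currently BLOCKED waiting to enter a monitor. */
    static String blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // true, true: also collect locked monitors and ownable synchronizers
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                sb.append(info.getThreadName())
                  .append(" waits for ").append(info.getLockName())
                  .append(" held by ").append(info.getLockOwnerName())
                  .append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        final Object document = new Object(); // hypothetical stand-in for the document object
        synchronized (document) {
            // The worker tries to enter the same monitor and therefore blocks.
            Thread worker = new Thread(() -> { synchronized (document) { } }, "worker");
            worker.start();
            while (worker.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
            System.out.print(blockedThreads());
        }
    }
}
```

A thread waiting to enter a synchronized block is reported as BLOCKED rather than RUNNABLE, so if the stalled servlet thread shows up in such a report, the "held by" part names the thread that owns the document's monitor.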

1 Answer


Make sure the application code is not changing thread priorities via Thread.setPriority() or similar. If you are using an IDE such as IntelliJ, the Java sources are available, and you can run the application and the relevant flow locally on your development machine, put a breakpoint in Thread.setPriority() to see whether it is invoked anywhere. This excerpt from Java Concurrency in Practice (Goetz, 2006) describes how unpredictable behavior can become when you set thread priorities manually:

10.3.1. Starvation

Starvation occurs when a thread is perpetually denied access to resources it needs in order to make progress; the most commonly starved resource is CPU cycles. Starvation in Java applications can be caused by inappropriate use of thread priorities. It can also be caused by executing nonterminating constructs (infinite loops or resource waits that do not terminate) with a lock held, since other threads that need that lock will never be able to acquire it.

The thread priorities defined in the Thread API are merely scheduling hints. The Thread API defines ten priority levels that the JVM can map to operating system scheduling priorities as it sees fit. This mapping is platform specific, so two Java priorities can map to the same OS priority on one system and different OS priorities on another. Some operating systems have fewer than ten priority levels, in which case multiple Java priorities map to the same OS priority.

Operating system schedulers go to great lengths to provide scheduling fairness and liveness beyond that required by the Java Language Specification. In most Java applications, all application threads have the same priority, Thread.NORM_PRIORITY. The thread priority mechanism is a blunt instrument, and it's not always obvious what effect changing priorities will have; boosting a thread's priority might do nothing or might always cause one thread to be scheduled in preference to the other, causing starvation.

It is generally wise to resist the temptation to tweak thread priorities. As soon as you start modifying priorities, the behavior of your application becomes platform specific and you introduce the risk of starvation. You can often spot a program that is trying to recover from priority tweaking or other responsiveness problems by the presence of Thread.sleep or Thread.yield calls in odd places, in an attempt to give more time to lower priority threads.
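
If attaching a debugger is not practical, the priorities can also be inspected at runtime. This is a rough sketch, assuming you can call a small helper from within the running application; PriorityAudit and nonDefaultPriorities are hypothetical names introduced only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class PriorityAudit {
    /** Returns "name=priority" for every live thread whose priority is not NORM_PRIORITY. */
    static List<String> nonDefaultPriorities() {
        List<String> found = new ArrayList<>();
        // getAllStackTraces() is used here only to enumerate the live threads.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getPriority() != Thread.NORM_PRIORITY) {
                found.add(t.getName() + "=" + t.getPriority());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        nonDefaultPriorities().forEach(System.out::println);
    }
}
```

Some JVM-internal threads usually run above normal priority, so expect a few entries even in a healthy application; the ones worth a closer look are application or servlet-container threads that appear in the list.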

Jose Quijada
  • 558
  • 6
  • 13