What can potentially cause very high sync times during Safepoint operations with low or no spin and block times?

Question

Safepoint operations in our JVM occasionally take very long to sync the threads. However the spin and block times are very low during these occurences. The common cause shows <100ms for spin but over 30seconds up to 4 minutes for sync.

Neither the Thread dumps on Safepointtimeout, nor an attempted log analysis of what is going on in those moments helped so far.

The system is a realtime RTP Streaming application using JMF (and JDK 1.7) on a Windows Server VM.

The JVM Flags used:

-server
-Xloggc:./Log/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xmx2G
-XX:-OmitStackTraceInFastThrow
-XX:+PrintSafepointStatistics
–XX:PrintSafepointStatisticsCount=1
-XX:+SafepointTimeout
-XX:SafepointTimeoutDelay=20

The Logs for the Safepoints look as follows (including a few surrounding ones)

55577.063: RevokeBias                       [    2122          0              3    ]      [     0     0     0     0     0    ]  0   
55577.066: BulkRevokeBias                   [    2126        168              4    ]      [    49     0 94867     2    13    ]  0   
55671.953: RevokeBias                       [    2111          1              0    ]      [     0     0    15     1    16    ]  0   



56867.883: RevokeBias                       [    1742          0              1    ]      [     0     0     0     0     0    ]  0   
56868.738: RevokeBias                       [    1746        457              1    ]      [    88     0 39473     2     0    ]  0   
56908.219: RevokeBias                       [    1755          0              5    ]      [     0     0     0     1     0    ]  0

I would like to know what kind of circumstances could cause these massive sync times without being counted as spin or block times.

Thank you in advance

Any thread which is too busy doing something else. e.g. flushing a file to disk can do this. — Peter Lawrey, Jan 08 '19 at 15:35
@PeterLawrey is there any way of determining the offending thread? The safepointtimeout option is sadly not helpful, sometimes not even showing any threads at all. — Sonaes, Jan 08 '19 at 15:47
`-XX:+SafepointTimeout` option already prints the threads that fail to stop after the given `SafepointTimeoutDelay`. — apangin, Jan 08 '19 at 16:42
@apangin Thank you for the link. Your answer in that question was the reason i started using the SafepointTimeoutDelay param. However it usually prints several hundred Threads, or in 2 peculiar cases none at all: `# SafepointSynchronize::begin: Timeout detected:` `# SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.` `# SafepointSynchronize::begin: Threads which did not reach the safepoint:` `# SafepointSynchronize::begin: (End of list)` — Sonaes, Jan 08 '19 at 16:47
@apangin the safepoint log for this timeout: `320166.625: RevokeBias [ 1960 2 5 ] [ 29 3 112 41 5 ] 0` states it is waiting for 5 threads to block — Sonaes, Jan 08 '19 at 16:50
This means the problem is not in a particular thread(s) but rather because JVM process does not receive CPU time. Low OS memory is a usual source of such problems (swapping / transparent huge pages / direct reclaim). — apangin, Jan 08 '19 at 16:59

What can potentially cause very high sync times during Safepoint operations with low or no spin and block times?

0 Answers0