I am parsing IBM JVM thread dumps to extract various pieces of information. In this case I need the id of the lock a thread is waiting on and the id of the thread that owns that lock. The lock id has the same form in all dumps: a hex value such as 0x000000000B0D9A20. The id of the thread holding the lock, however, can be either a hex id of the same form or a placeholder such as <unknown> or <unowned>. It is the lock owner's id that I am finding difficult to extract.
An IBM thread dump reports lock information with one of three clauses (as you can see in the first screenshot):
Waiting on...
Blocked on...
Parked on...
I combine these clauses with an OR (alternation) in the regex.
I have written a generic method which accepts:
- the thread dump line
- the regex to apply
- the number of groups to return in a list
For example, the call method1(threadDumpLine,regex,2) will apply regex on threadDumpLine and return the list [group1,group2], where group1 is the first captured group and group2 is the second.
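To make the helper concrete, here is a minimal sketch of what a method like method1 might look like. The class name GroupExtractor, the parameter names, and the choice to return an empty list when nothing matches are assumptions made for illustration, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupExtractor {
    // Hypothetical reconstruction of method1: applies regex to line and
    // returns capture groups 1..groupCount in order.
    // A group that did not take part in the match is returned as null.
    // Asking for more groups than the regex defines throws
    // IndexOutOfBoundsException (standard Matcher.group behaviour).
    static List<String> method1(String line, String regex, int groupCount) {
        List<String> groups = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(line);
        if (m.find()) {
            for (int i = 1; i <= groupCount; i++) {
                groups.add(m.group(i));
            }
        }
        return groups; // empty when the regex does not match at all
    }

    public static void main(String[] args) {
        // Made-up input, only to show the call shape.
        System.out.println(method1("lock=0x1A owner=0x2B",
                "lock=(0x[0-9A-F]+) owner=(0x[0-9A-F]+)", 2));
    }
}
```

Because the caller only sees groups by position, the lock owner has to land in group 2 regardless of which form it takes.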
The first group needs to be the lock id, which I am able to capture, but the second group can be either a hex id or <unknown> or <unowned>. I am able to capture the lock owner's id as the 2nd group when it is a hex id, but when it is <unowned> or <unknown> it turns out to be the 3rd group:
(above on regex101)
That was happening because I defined two separate groups, one to capture the lock owner's hex id and one to capture the <text> form. So I tried to combine the two into one as follows:
(above on regex101)
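For illustration, the single-group idea can be sketched as below: the three clauses go into a non-capturing group, the text in front of the owner's hex id becomes an optional non-capturing prefix, and the hex id and the two placeholders share one capturing group. The line layout (class@0x... followed by Owned by: and either "name" (J9VMThread:0x..., ...) or a placeholder), the sample lines, and the names LockOwnerSketch and extract are assumptions, not taken from the dumps or the regex above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LockOwnerSketch {
    // Assumed line shapes (hypothetical, not copied from a real dump):
    //   Blocked on: java/lang/Object@0x... Owned by: "name" (J9VMThread:0x..., ...)
    //   Waiting on: java/lang/Object@0x... Owned by: <unowned>
    private static final Pattern LOCK_LINE = Pattern.compile(
        "(?:Waiting|Blocked|Parked) on: \\S+@(0x[0-9A-Fa-f]+) Owned by: "
        + "(?:\"[^\"]*\" \\(J9VMThread:)?"          // optional, non-capturing prefix
        + "(0x[0-9A-Fa-f]+|<unknown>|<unowned>)");  // group 2: hex id or placeholder

    // Returns {lockId, ownerId}, or null when the line has no lock clause.
    static String[] extract(String line) {
        Matcher m = LOCK_LINE.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        // Made-up sample lines following the assumed layout.
        String owned = "Blocked on: java/lang/Object@0x000000000B0D9A20 Owned by: "
            + "\"worker-1\" (J9VMThread:0x0000000001F3A500, java/lang/Thread:0x000000000B0D8F10)";
        String unowned = "Waiting on: java/lang/Object@0x000000000B0D9A20 Owned by: <unowned>";
        System.out.println(String.join(" ", extract(owned)));
        System.out.println(String.join(" ", extract(unowned)));
    }
}
```

With only two capturing groups in the pattern, the owner is always group 2, whether it is a hex id or a placeholder.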
How can I change the regex to capture the groups as specified above with the fewest possible steps?