I am parsing IBM JVM thread dumps to extract various pieces of information. In the current context, I need the ID of the lock a thread is waiting on and the ID of the thread that owns that lock. The lock ID is consistent across all dumps and takes the form of a hex value such as 0x000000000B0D9A20. The ID of the thread holding the lock, however, varies: it can be a hex ID of the same form as the lock ID, or a placeholder such as <unknown> or <unowned>. It is the lock owner's ID that I am finding difficult to extract.

An IBM thread dump specifies lock information with three clauses (as shown in the first screenshot):

  1. Waiting on...
  2. Blocked on...
  3. Parked on...

I combine these clauses with an OR (alternation) in the regex.

I have written a generic method that accepts:

  1. the thread dump line
  2. the regex to apply
  3. the number of groups to return in a list

For example, the call method1(threadDumpLine, regex, 2) applies regex to threadDumpLine and returns the list [group1, group2], where group1 is the first captured group and group2 is the second.
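A minimal sketch of such a helper, written in Python purely for illustration (the question does not show the original method or say which language it uses). The name `extract_groups`, the sample line, and the lock-only pattern are all hypothetical.

```python
import re


def extract_groups(line, pattern, group_count):
    """Apply `pattern` to `line` and return the first `group_count`
    captured groups as a list, or an empty list if nothing matches."""
    match = re.search(pattern, line)
    if match is None:
        return []
    # Groups are numbered from 1; group 0 is the whole match.
    return [match.group(i) for i in range(1, group_count + 1)]


# Hypothetical thread dump line, shaped like the clauses listed above.
sample_line = "Blocked on: java/lang/Object@0x000000000B0D9A20 Owned by: <unowned>"

# Illustrative pattern that captures only the lock ID.
lock_id_regex = r"@(0x[0-9A-F]+)"

result = extract_groups(sample_line, lock_id_regex, 1)
print(result)
```

A group that does not participate in a match comes back as `None` from `match.group(i)`, which is why a regex with two separate owner groups leaves one list slot empty and pushes the value into the other.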

The first group needs to be the lock ID, which I am able to capture. The second group can be a hex ID, <unknown>, or <unowned>. I can capture the lock owner's ID as the second group when it is a hex value, but when it is <unowned> or <unknown> it ends up in the third group:

[Screenshot: first regex on regex101]

That happened because I defined two separate groups, one for the lock owner's hex ID and one for the <text> form. So I tried to combine the two into a single group as follows:

[Screenshot: combined-group regex on regex101]

How can I change the regex to capture the groups as specified above in the fewest possible steps?

Mahesha999
  • It would be nice to see all the data, because there could be exceptions, but here is one suggestion: `(?:Blocked on|Parked on|Waiting on):[^@]*@(0x[0-9A-F]*) Owned by:.*?(<[^>]+>|0x[0-9A-F]*).*` – i-- May 03 '16 at 14:26
  • Note: The `' on'` text could be moved outside the `(?:...)`, as in `(?:Blocked|Parked|Waiting) on` – Washington Guedes May 03 '16 at 14:33

2 Answers

You can use this negation-based regex to get the right captured group #2:

(?:Blocked on|Parked on|Waiting on):[^@]*@(0x[0-9A-F]+) Owned by:[^<\n]*(0x[0-9]+|[^>\n]+)

RegEx Demo

This gives the following match data:

MATCH 1
1.  [69-87] `0x000000000B0D9A20`
2.  [185-186]   `)`

MATCH 2
1.  [288-306]   `0x000000000296F1E8`
2.  [317-325]   `<unknown`

MATCH 3
1.  [466-484]   `0x0000000030A0C590`
2.  [495-503]   `<unowned`
anubhava

Try this:

(?:Blocked on|Parked on|Waiting on):[^@]*@(0x[0-9A-F]+) Owned by:[^<\n]*?(0x[0-9A-F]+|<.*?>)

Output:

MATCH 1
    > 1.    [69-87]     `0x000000000B0D9A20`
    > 2.    [130-148]   `0x00000000846F4900` 
MATCH 2
    > 1.    [288-306]   `0x000000000296F1E8`
    > 2.    [317-326]   `<unknown>` 
MATCH 3
    > 1.    [466-484]   `0x0000000030A0C590`
    > 2.    [495-504]   `<unowned>`

demo
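As a rough check, the pattern above can be run against a few lines in Python. The three sample lines below are hypothetical, modeled on the three clause types in the question and reusing the IDs from the match output; the real dump lines are only in the screenshots. The helper name `lock_and_owner` is also an assumption.

```python
import re

# Pattern from this answer, unchanged (split across two literals for width).
LOCK_PATTERN = re.compile(
    r"(?:Blocked on|Parked on|Waiting on):[^@]*@(0x[0-9A-F]+) "
    r"Owned by:[^<\n]*?(0x[0-9A-F]+|<.*?>)"
)

# Hypothetical dump lines, one per clause type.
SAMPLE_LINES = [
    'Blocked on: java/lang/Object@0x000000000B0D9A20 Owned by: "Worker-1" '
    "(J9VMThread:0x00000000846F4900, java/lang/Thread:0x000000000B0D8F10)",
    "Parked on: java/util/concurrent/locks/ReentrantLock$NonfairSync"
    "@0x000000000296F1E8 Owned by: <unknown>",
    "Waiting on: java/lang/Object@0x0000000030A0C590 Owned by: <unowned>",
]


def lock_and_owner(line):
    """Return [lock_id, owner] for a dump line, or [] if it has no lock clause."""
    match = LOCK_PATTERN.search(line)
    return list(match.groups()) if match else []


results = [lock_and_owner(line) for line in SAMPLE_LINES]
for pair in results:
    print(pair)
```

The lazy `[^<\n]*?` is what makes a single owner group work: it skips as little text as possible after `Owned by:`, so the alternation stops at the first hex ID or at the first `<...>` token, whichever appears. Note that a thread name containing something that looks like a hex ID would be captured instead of the real owner ID.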

Scott Weaver