I am working on a big log file whose entries look as follows:
-- "GET <b>/fss-w3-mtpage.php</b> HTTP/1.1" 200 0.084 41 "-" "c110bc/1.0" 127.0.0.1:25001 0.084
-- "GET <b>/m/firstpage/Services/getAll</b>?ids=ABCVDFDS,ASDASBDB,ASDBSA&requestId=091fa2b4-643e-4473-b6d8-40210b775dcf HTTP/1.1" 200
-- POST <b>/lastpage/Services/getAll</b>?ids=ABCVDFDS,ASDASBDB,ASDBSA&requestId=091fa2b4-643e-4473-b6d8-40210b775dcf HTTP/1.1" 200
I want to extract the part shown in bold in the samples above. Here is the regex I wrote for this:
.*(POST|GET)\s+(([^\?]+)|([^\s]))
I want to get the part that comes after GET or POST, up to the first occurrence of a space ' ' or a question mark '?'.
Problem
The logical OR in the latter part of the regex is not working.
If I use only
.*(POST|GET)\s+([^\?]+)
I get the correct portion, i.e. from GET or POST up to the first question mark '?'. Similarly, if I use
.*(POST|GET)\s+([^\s]+)
I get the correct portion, i.e. from GET or POST up to the first space ' '.
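The behaviour described above can be reproduced with a short script. This is only a minimal sketch: it assumes Python's `re` module (the regex engine is not stated here), strips the `<b>` markup from the sample lines, and uses a hypothetical helper `extract` that returns capture group 2. The last pattern, which uses a single negated character class `[^\s?]+`, is not one of the original attempts; it is included for comparison as one possible way to say "stop at a space or a question mark".

```python
import re

# Sample log lines from above, with the <b> markup removed.
LINES = [
    '-- "GET /fss-w3-mtpage.php HTTP/1.1" 200 0.084 41 "-" "c110bc/1.0" 127.0.0.1:25001 0.084',
    '-- "GET /m/firstpage/Services/getAll?ids=ABCVDFDS,ASDASBDB,ASDBSA&requestId=091fa2b4-643e-4473-b6d8-40210b775dcf HTTP/1.1" 200',
    '-- POST /lastpage/Services/getAll?ids=ABCVDFDS,ASDASBDB,ASDBSA&requestId=091fa2b4-643e-4473-b6d8-40210b775dcf HTTP/1.1" 200',
]

PATTERNS = {
    # The three patterns quoted in the question.
    "alternation": r'.*(POST|GET)\s+(([^\?]+)|([^\s]))',
    "until_question_mark": r'.*(POST|GET)\s+([^\?]+)',
    "until_space": r'.*(POST|GET)\s+([^\s]+)',
    # Assumption, not from the question: one class that excludes both characters.
    "single_class": r'.*(POST|GET)\s+([^\s?]+)',
}


def extract(pattern, line):
    """Hypothetical helper: return capture group 2 (the text after GET/POST), or None."""
    match = re.match(pattern, line)
    return match.group(2) if match else None


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(name)
        for line in LINES:
            print("   ", repr(extract(pattern, line)))
```

Comparing the output of `alternation` with `until_question_mark` on the first sample line (which contains no '?') shows where the two alternatives stop behaving as intended.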
Can anyone please tell me where I am going wrong?