Below is some sample log file data:

08/22/2018 02:50:06.380 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02500636+0400 C7STA PLINUX03 ALOPMTA2.N01834/LO.S00001D182340248/MAIN State EXEC SetStart Status(Executing at PLINUX03) Jobno(34523) ChildPid(34527)  User(PLINUX03) Host(localhost)
08/22/2018 02:50:06.382 EDT-0400 5 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannelLogHelper.logConnectionClose[:133] - Conversation with C7STA closed
08/22/2018 02:51:21.761 EDT-0400 5 TCP/IP Controller Plugin.Transmitter pool thread <Regular:1>.CybTargetHandlerChannel.call[:666] - Attempting to send:    20180822 02512176+0400 C7STA PLINUX03 ALOECPC7.N01745/LO.S00002D182340242/MAIN State COMPLETE Cmpc(0) SetEnd  User(PLINUX03) Host(localhost)
08/22/2018 02:51:21.771 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:1>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02512176+0400 C7STA PLINUX03 ALOECPC7.N01745/LO.S00002D182340242/MAIN State COMPLETE Cmpc(0) SetEnd  User(PLINUX03) Host(localhost)

I am trying to extract the five fields below from the first and fourth lines, which contain "Message has been sent":

  1. TimeStamps: 20180822 02500636+0400, 20180822 02512176+0400
  2. JobNames : ALOPMTA2,ALOECPC7
  3. JobNumbers : 01834,01745
  4. Users : User(PLINUX03), User(PLINUX03)
  5. Statuses : MAIN State EXEC SetStart, MAIN State COMPLETE

I was able to filter the lines containing "Message has been sent:" using the expression below, but I am not sure how to extract the five fields from those lines:

^.*\b(Message has been sent:.)\b.*$

Can someone help? This is for extraction on Splunk. Thank you!

AncientSwordRage
user292033
  • Not a problem, I've edited that in for you. Hopefully someone can help. What have you tried so far, and what guides or other resources have you been reading? – AncientSwordRage Aug 22 '18 at 13:43
  • I am new to regex and was able to filter out the first and fourth lines using the code below: ^.*\b(Message has been sent:.)\b.*$ I was using regexer for some quick trial and error, but couldn't pinpoint how to fetch the five fields – user292033 Aug 22 '18 at 14:14
  • I'm not familiar with Splunk. Have you read the [documentation](http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Regex), for instance? – AncientSwordRage Aug 22 '18 at 14:18
  • Yes, it said that the regex needs to be Perl compatible. Should I add Perl as a tag as well? – user292033 Aug 22 '18 at 14:26

1 Answer

I suggest this regex:

Message has been sent: (?<timestamp>\d{8}\s\d{8}\+\d{4})\s\w+\s\w+\s(?<jobname>\w+)\.N(?<jobnumber>\d+)\/[^\/]+\/(?<statuses>(\w+\s)+)\w+\(.+User\((?<user>\w+)\)
  • Group 'timestamp' (\d{8}\s\d{8}\+\d{4}) : matches the timestamps
  • Group 'jobname' \s(\w+)\.N : matches the job names
  • Group 'jobnumber' \.N(\d+)\/ : matches the job numbers
  • Group 'statuses' ((\w+\s)+) : matches the statuses
  • Group 'user' User\((\w+)\) : matches the users
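
As a quick check outside Splunk, the same pattern can be run against the sample lines in Python. This is only a sketch under a few assumptions: Python is not part of the original question, its `re` module writes named groups as `(?P<name>...)` rather than `(?<name>...)`, and the helper name `extract_fields` and the trimming of the trailing space in `statuses` are my own additions for illustration.

```python
import re

# Same pattern as above, with Python's (?P<name>...) spelling for named groups.
# The forward slashes do not need escaping in Python.
PATTERN = re.compile(
    r"Message has been sent: (?P<timestamp>\d{8}\s\d{8}\+\d{4})\s\w+\s\w+\s"
    r"(?P<jobname>\w+)\.N(?P<jobnumber>\d+)/[^/]+/"
    r"(?P<statuses>(\w+\s)+)\w+\(.+User\((?P<user>\w+)\)"
)

# First, second and fourth sample lines from the question.
SAMPLE_LINES = [
    "08/22/2018 02:50:06.380 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02500636+0400 C7STA PLINUX03 ALOPMTA2.N01834/LO.S00001D182340248/MAIN State EXEC SetStart Status(Executing at PLINUX03) Jobno(34523) ChildPid(34527)  User(PLINUX03) Host(localhost)",
    "08/22/2018 02:50:06.382 EDT-0400 5 TCP/IP Controller Plugin.Transmitter pool thread <Regular:2>.CybTargetHandlerChannelLogHelper.logConnectionClose[:133] - Conversation with C7STA closed",
    "08/22/2018 02:51:21.771 EDT-0400 2 TCP/IP Controller Plugin.Transmitter pool thread <Regular:1>.CybTargetHandlerChannel.call[:695] - Message has been sent: 20180822 02512176+0400 C7STA PLINUX03 ALOECPC7.N01745/LO.S00002D182340242/MAIN State COMPLETE Cmpc(0) SetEnd  User(PLINUX03) Host(localhost)",
]


def extract_fields(line):
    """Return the five named fields as a dict, or None if the line does not match.

    Hypothetical helper, not part of Splunk or of the original answer.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()  # only the named groups are included
    # The 'statuses' group captures the space before the next token; trim it.
    fields["statuses"] = fields["statuses"].strip()
    return fields


for line in SAMPLE_LINES:
    print(extract_fields(line))
```

Lines without "Message has been sent:" simply return `None`, so the same pattern does both the filtering and the extraction.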

You can see an example here, using the data you provided: https://regex101.com/r/G6GD46/4

Do not hesitate to play with this example to get the result you need.

Tell me if you need more explanation of this regex.

Edit: as suggested by @RichG in the comments, I've added named groups so that Splunk can extract the groups as fields.

Anthony BONNIER
  • Include group names in the regex string so Splunk will extract fields. `index=foo "Message has been sent" | rex "sent: (?<timestamp>\d{8}\s\d{8}\+\d{4})\s\w+\s\w+\s(?<jobname>\w+)\.N(?<jobnumber>\d+)\/[^\/]+\/(?<statuses>(\w+\s)+)\w+\(.+User\((?<user>\w+)\)" | ...` – RichG Aug 22 '18 at 18:36