
I wish to transform my nginx logs using a regular expression script as follows:

original log:

07.21.99.178 - - [01/Jun/2012:12:06:23 +0530] "GET /api?playSessionId=live_21_bc206d95-113f-4b49-989b-7dff77af51c410.190.217.2111338532565422 HTTP/1.1" 200 71 "-" "Jakarta Commons-HttpClient/3.1"

I want the playSessionId as the output for which I used the following script:

#!/usr/bin/env ruby

mon={"Jan" => '01',"Feb" => '02',"Mar" => '03',"Apr" => '04',"May" => '05',"Jun" => '06',"Jul" => '07',"Aug" => '08',"Sep" => '09',"Oct" => '10',"Nov" => '11',"Dec" => '12'}

STDIN.each_line do |line|
  if line =~ /([\d+|\.]+) (\d+)\/(\w+)\/(\d+):(\d+):\d+:\d+ \+\d+] "GET \/api\?playSessionId=(^&*)/
    d = "#{$3}-#{mon$2}-#{$1}"
    h = $4
    pid = $5
    puts "#{d}\t#{h}\t#{pid}"
  end
end

But this doesn't seem to work :( Can someone tell me the Java regex for this so that I can use RLIKE on Hive?

princess of persia
    If I understand correctly, you want this string `live_21_bc206d95-113f-4b49-989b-7dff77af51c410.190.217.2111338532565422` – m0skit0 Jun 14 '12 at 09:39

2 Answers


I think you want this:

/([\d+|\.]+) (\d+)\/(\w+)\/(\d+):(\d+):\d+:\d+ \+\d+] "GET \/api\?playSessionId=([^ &]+)/

You might want to use standard Unix tools for this job instead (grep + sed):

grep 'playSessionId=' foo.log | sed 's/^.*playSessionId=\([^ ]*\).*$/\1/'
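
For anyone who wants to stay in Ruby, here is a minimal sketch of how the `([^ &]+)` capture group could fit into a script like the one in the question. It is not the asker's original script: the names `MONTHS`, `LOG_PATTERN` and `extract_session` are made up for illustration, and the pattern assumes the log layout shown in the sample line (client IP, ` - - `, then the bracketed timestamp). It also anchors on the `[` before the date rather than on the IP address, which is a deliberate change from the regex above.

```ruby
# Map nginx month abbreviations to two-digit month numbers.
MONTHS = {
  "Jan" => "01", "Feb" => "02", "Mar" => "03", "Apr" => "04",
  "May" => "05", "Jun" => "06", "Jul" => "07", "Aug" => "08",
  "Sep" => "09", "Oct" => "10", "Nov" => "11", "Dec" => "12"
}.freeze

# Assumed layout: ... [dd/Mon/yyyy:hh:mm:ss +zzzz] "GET /api?playSessionId=<id> ..."
# Captures: day, month name, year, hour, session id (up to the next space or '&').
LOG_PATTERN = %r{\[(\d+)/(\w+)/(\d+):(\d+):\d+:\d+ [+-]\d+\] "GET /api\?playSessionId=([^ &]+)}

# Returns "yyyy-mm-dd<TAB>hh<TAB>id" for a matching line, or nil otherwise.
def extract_session(line)
  m = LOG_PATTERN.match(line)
  return nil unless m

  day, mon, year, hour, pid = m.captures
  "#{year}-#{MONTHS[mon]}-#{day}\t#{hour}\t#{pid}"
end

# Sample line copied from the question.
SAMPLE = '07.21.99.178 - - [01/Jun/2012:12:06:23 +0530] "GET /api?playSessionId=live_21_bc206d95-113f-4b49-989b-7dff77af51c410.190.217.2111338532565422 HTTP/1.1" 200 71 "-" "Jakarta Commons-HttpClient/3.1"'

puts extract_session(SAMPLE)
```

To process a whole log, the same method can be called on each line of `STDIN`, printing only the non-nil results. Since the question mentions Hive: the group `([^ &]+)` uses only syntax that Java's `java.util.regex` also accepts, so it should carry over, though that is untested here.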
troelskn

I think your regex is way too complicated. This will do the job:

playSessionId=(.*)\s
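
One thing worth checking before relying on this: `.*` is greedy, so on a full log line a pattern of the form `playSessionId=(.*)\s` can run past the session id to the last whitespace on the line. The Ruby sketch below compares it with a `\S+` variant on the sample line from the question. The `\S+` variant and the method names `greedy_id` and `strict_id` are suggestions for illustration, not part of the answer above.

```ruby
# Greedy form: (.*) takes as much as it can, then backs off to the last whitespace.
def greedy_id(line)
  line[/playSessionId=(.*)\s/, 1]
end

# Stricter form: \S+ stops at the first whitespace after the id.
def strict_id(line)
  line[/playSessionId=(\S+)/, 1]
end

# Sample line copied from the question (no trailing newline).
SAMPLE_LINE = '07.21.99.178 - - [01/Jun/2012:12:06:23 +0530] "GET /api?playSessionId=live_21_bc206d95-113f-4b49-989b-7dff77af51c410.190.217.2111338532565422 HTTP/1.1" 200 71 "-" "Jakarta Commons-HttpClient/3.1"'

puts "greedy: #{greedy_id(SAMPLE_LINE)}"
puts "strict: #{strict_id(SAMPLE_LINE)}"
```

On this line the greedy capture includes the trailing ` HTTP/1.1" 200 71 ...` text, while the `\S+` capture is just the id. If the URL can carry further query parameters, `[^ &]+` is the safer character class.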
m0skit0