I am trying to generate Grok patterns automatically using LogMine.

Log sample:

Error   IGXL    error [Slot 2, Chan 16, Site 0] HSDMPI:0217 : TSC3 Fifo Edge EG0-7 Underflow.  Please check the timing programming.  Edge events should be fired in the sequence and the time between two edges should be more than 2 MOSC ticks.    
Error   IGXL    error [Slot 2, Chan 18, Site 0] HSDMPI:0217 : TSC3 Fifo Edge EG0-7 Underflow.  Please check the timing programming.  Edge events should be fired in the sequence and the time between two edges should be more than 2 MOSC ticks.    

For the above logs, I am getting the following pattern:

re.compile('^(?P<Event>.*?)\\s+(?P<Tester>.*?)\\s+(?P<State>.*?)\\s+(?P<Slot>.*?)\\s+(?P<Instrument>.*?)\\s+(?P<Content1>.*?):\\s+(?P<Content>.*?)$') 
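For context, a regex of this shape can be produced from a `<Field>`-style format string by turning each field into a lazy named group and each run of spaces into `\s+`. The sketch below is my own reconstruction of that idea, not code copied from LogMine. The helper name `format_to_regex` and the format string are assumptions, chosen so that the result has the same shape as the pattern above. The sample line is the first log line, truncated after the first sentence.

```python
import re

def format_to_regex(log_format):
    """Build an anchored regex with named groups from a '<Field>' format string.

    Hypothetical helper for illustration; not part of LogMine's public API.
    """
    # Splitting on a capturing group keeps the <Field> tokens at odd indexes.
    parts = re.split(r'(<[^<>]+>)', log_format)
    headers = []
    pattern = ''
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text between fields: escape it, runs of spaces become \s+
            pattern += r'\s+'.join(re.escape(p) for p in re.split(r' +', part))
        else:
            name = part.strip('<>')
            headers.append(name)
            pattern += '(?P<%s>.*?)' % name
    return headers, re.compile('^' + pattern + '$')

# Assumed format string; it reproduces the group names seen in the output above.
headers, regex = format_to_regex(
    '<Event> <Tester> <State> <Slot> <Instrument> <Content1>: <Content>')

sample = ('Error   IGXL    error [Slot 2, Chan 16, Site 0] '
          'HSDMPI:0217 : TSC3 Fifo Edge EG0-7 Underflow.')
match = regex.match(sample)
print(regex.pattern)
print(match.groupdict())
```

Because every field is `.*?` and the only separators are whitespace and `: `, the groups split on the first whitespace they meet. `Slot` captures `[Slot` and `Instrument` captures `2,`, which shows that the format string decides where the fields fall, not the clustering.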

But I expect a Grok pattern (for Logstash) that looks like this:

%{LOGLEVEL:level}    *%{DATA:Instrument} %{LOGLEVEL:State} \[%{DATA:slot} %{DATA:slot} %{DATA:channel} %{DATA:channel} %{DATA:Site}] %{DATA:Tester} : %{DATA:Content}    

Code: LogMine is imported from the following link: https://github.com/logpai/logparser/tree/master/logparser/LogMine

import os
import sys

sys.path.append('../')
import LogMine

# Raw strings keep the Windows backslashes literal
input_dir  = r'E:\LogMine\LogMine'  # The input directory of the log file
output_dir = r'E:\LogMine\LogMine/output/'  # The output directory of parsing results
log_file   = r'E:\LogMine\LogMine/log_teradyne.txt'  # The input log file name
log_format = '<Event> <Tester> <State> <Slot> <Instrument><content> <contents> <context> <desc> <junk> '  # Log format (field layout of each line)
levels     = 1  # The levels of hierarchy of patterns
max_dist   = 0.001  # The maximum distance between any log message in a cluster and the cluster representative
k          = 1  # The message distance weight (default: 1)
regex      = []  # Regular expression list for optional preprocessing (default: [])

print(os.getcwd())
parser = LogMine.LogParser(input_dir, output_dir, log_format, rex=regex, levels=levels, max_dist=max_dist, k=k)
parser.parse(log_file)

This code returns only the parsed CSV file. I want to generate the Grok patterns and use them later in a Logstash application to parse the logs.
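To make the goal concrete, here is a minimal sketch of the conversion I have in mind: take the named-group regex and rewrite each `(?P<name>.*?)` group as a Grok reference. It is not something LogMine provides. The function `regex_to_grok` and its `types` argument are hypothetical names of mine. It relies on Grok's stock `DATA` pattern being `.*?`, so the default substitution keeps the same matching behaviour, and it leaves the anchors and `\s+` separators in place as plain regex.

```python
import re

# Matches the lazy named groups found in the generated pattern, e.g. (?P<Event>.*?)
NAMED_GROUP = re.compile(r'\(\?P<(\w+)>\.\*\?\)')

def regex_to_grok(pattern, types=None):
    """Rewrite (?P<name>.*?) groups as %{TYPE:name}; TYPE defaults to DATA.

    Hypothetical helper; `types` maps a field name to a Grok pattern name.
    """
    types = types or {}

    def to_grok(m):
        name = m.group(1)
        return '%{' + types.get(name, 'DATA') + ':' + name + '}'

    # A function replacement is inserted literally, so backslashes in the
    # rest of the pattern (such as \s+) are left untouched.
    return NAMED_GROUP.sub(to_grok, pattern)

generated = (r'^(?P<Event>.*?)\s+(?P<Tester>.*?)\s+(?P<State>.*?)\s+'
             r'(?P<Slot>.*?)\s+(?P<Instrument>.*?)\s+(?P<Content1>.*?):\s+'
             r'(?P<Content>.*?)$')

# Overriding Event with LOGLEVEL is my own choice for this example.
grok = regex_to_grok(generated, types={'Event': 'LOGLEVEL'})
print(grok)
```

This only changes the syntax. Splitting the bracketed part into slot, channel and site, as in my expected pattern, would still need a more specific pattern for that section.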

mihomir
Rakesh TS

Awesome idea! Not sure how to do it though. I'm really interested in figuring out a way to duplicate this https://www.datadoghq.com/blog/log-patterns/ functionality in something like ElasticStack or GrafanaLoki. – neoakris Jul 28 '19 at 21:25

0 Answers