I am analyzing log data from a process. The columns are id, date, time, log code, and log text. The id is unique per product. Date and time record when the log entry was captured. Log code is the code specific to the log text, and log text is a 256-character text that describes the process.

e.g.

ID  Date             time   log id          log text
A   01/10/18    9:00:00 bbb process begin
A   01/10/18    9:00:00 yyy dimensions not specified
A   01/10/18    9:00:30 fff failure
A   01/10/18    9:00:30 ddd dispatched
A   01/10/18    9:00:30 sss process success
B   01/10/18    9:01:01 bbb process begin
B   01/10/18    9:01:50 mmm moved to stage2
B   01/10/18    9:02:50 aaa space not allocated
B   01/10/18    9:02:50 fff failure

I want to grep (or rather, create a subset of) the above dataset into a CSV or XLS output that meets the conditions below (which can change), for example:

  1. the 2 rows above each line where log text = failure, plus that line itself
  2. all rows where log id was sss

So my expected output is:

ID  Date            time    log id  log text
A   01/10/18    9:00:00 bbb process begin
A   01/10/18    9:00:00 yyy dimensions not specified
A   01/10/18    9:00:30 fff failure
B   01/10/18    9:01:50 mmm moved to stage2
B   01/10/18    9:02:50 aaa space not allocated
B   01/10/18    9:02:50 fff failure
A   01/10/18    9:00:30 sss process success

Using the discussion in the thread "Grep for a word, and if found print 10 lines before and 10 lines after the pattern match", I tried the following code:

import subprocess

filename = "filename.csv"    
string_to_search = "failure"    
extract = (subprocess.getstatusoutput("grep -C 2 '%s' %s"%(string_to_search, filename)))[1]
print(extract)
hitesh

1 Answer

You can use this code:

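with open("text.txt", "r") as f, open("output.txt", "w") as output:
    lines = f.readlines()
    for i, line in enumerate(lines):
        if "sss" in line:
            # Keep every row whose log id is sss.
            output.write(line)
        elif "failure" in line:
            # Write up to two rows before the failure, then the failure row.
            # enumerate gives the real position even when lines repeat;
            # max() stops a negative index from wrapping to the file's end.
            output.writelines(lines[max(i - 2, 0):i + 1])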
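Since the question says the conditions can change and the result should end up in a CSV, the same idea could be wrapped in a small function. This is only a sketch under a few assumptions: the rows are whitespace-separated as in the sample, the log id is the fourth field, and the names `select_rows`, `to_fields`, and `write_csv` are made up for illustration. It returns the failure context first and the `sss` rows last, matching the order of the expected output in the question.

```python
import csv


def select_rows(lines, trigger="failure", before=2, keep_code="sss"):
    """Return rows leading up to each trigger line, then all keep_code rows."""
    context_idx, kept_idx, seen = [], [], set()
    for i, line in enumerate(lines):
        fields = line.split()
        if trigger in line:
            # Collect up to `before` preceding rows plus the trigger row,
            # skipping rows already collected for an earlier trigger.
            for j in range(max(i - before, 0), i + 1):
                if j not in seen:
                    seen.add(j)
                    context_idx.append(j)
        elif len(fields) > 3 and fields[3] == keep_code:
            # Assumption: the log id is the fourth whitespace-separated field.
            kept_idx.append(i)
    # Drop keep_code rows that already appear as failure context.
    kept_idx = [i for i in kept_idx if i not in seen]
    return [lines[i] for i in context_idx + kept_idx]


def to_fields(row):
    """Split a row into ID, date, time, log id and the full log text."""
    return row.strip().split(None, 4)


def write_csv(rows, path):
    """Write the selected rows to a CSV file, one column per field."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        for row in rows:
            writer.writerow(to_fields(row))
```

For example, `write_csv(select_rows(open("text.txt").readlines()), "output.csv")` would produce the subset as a CSV. If the header row is among the two lines before a failure it would be included as context, so you may want to skip it first.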
Ali Hallaji