
I have a file named varout.txt that contains the following text:

Message: unable to locate element

I used the command below to fetch the text between the words Message and element:

result = subprocess.run(['grep -oP \'(?<(Message)).*(?= element)\' /home/ubuntu/varout.txt'],shell=True,capture_output=True)
reason = result.stdout
print(reason)

But I am getting the following output:

b' : unable to locate\n'

The expected output is shown below. Where am I going wrong?

': unable to locate'
  • Looks like you made a small typo when posting the code: `(?<(Message))` must be `(?<=(Message))`, otherwise you wouldn't have obtained any result. – Wiktor Stribiżew Jun 27 '22 at 08:48

1 Answer

result.stdout is a byte string, which is why print shows it with a b prefix and an escaped newline.

If you need to get the output as a Unicode string, decode the bytes:

reason = result.stdout.decode('utf-8')

See the demo:

import subprocess
# With shell=True, pass the whole command line as a single string
cmd = r"grep -oP 'Message\s*\K.*?(?=\s*element)' /home/ubuntu/varout.txt"
result = subprocess.run(cmd, shell=True, capture_output=True)
print(result.stdout.decode('utf-8'))
## => : unable to locate
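Note that decoding alone keeps the trailing newline that grep prints after each match. The sketch below shows two ways to end up with a clean str. To stay runnable anywhere, it does not call grep: the child process is a stand-in (an assumption of this example) that simply prints the expected line, and the variable names are illustrative.

```python
import subprocess
import sys

# Stand-in for the grep call: a child process that prints the expected line.
cmd = [sys.executable, "-c", "print(': unable to locate')"]

# Default behaviour: stdout is bytes and still ends with a newline.
raw = subprocess.run(cmd, capture_output=True).stdout
decoded = raw.decode("utf-8").rstrip()

# text=True (Python 3.7+) asks subprocess to decode stdout itself.
as_text = subprocess.run(cmd, capture_output=True, text=True).stdout.rstrip()

print(repr(decoded))
print(repr(as_text))
```

rstrip() removes only trailing whitespace, so any leading characters in the match are preserved.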

I improved the regex a bit as follows:

  • Message - matches a fixed string
  • \s* - zero or more whitespace characters
  • \K - match reset operator that discards all text matched so far
  • .*? - any zero or more chars other than line break chars, as few as possible
  • (?=\s*element) - a positive lookahead that matches a location that is immediately followed by zero or more whitespace characters and the substring element.
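If spawning grep is not a requirement, the same extraction can be sketched with Python's built-in re module. This is an alternative technique, not what the command above does: re has no \K operator, so a capturing group takes its place. The function name extract_reason is hypothetical, and the sample string stands in for the contents of varout.txt.

```python
import re

# Python's re module does not support \K, so the wanted text is
# captured in group 1 instead of resetting the match start.
PATTERN = re.compile(r"Message\s*(.*?)(?=\s*element)")


def extract_reason(text):
    """Return the text between 'Message' and 'element', or None if absent."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


# In the real script, the text would be read from the file instead.
print(repr(extract_reason("Message: unable to locate element")))
```

The result is already a str, so no decoding or newline stripping is needed.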
Wiktor Stribiżew