
The data I have in the test.txt file looks like this:

XXTYSVASOOXOSJAY
CGTVHIXXHIAICXWHAYX

and I'm trying to achieve a pattern like this:

XX-X
XX-X-X

This is what I have so far:

import re

data = open("test.txt", "r")
lines = data.readlines()

result = re.sub(r"[^X]+", r"-", str(lines)).strip("-")
if "X" in result:
  print(result)
else:
  print("No X found")

This is the result I get; everything ends up on a single line: XX-X-XX-X-X.

How can I do this correctly to get the expected result?

  • "How can I use readlines to format lines..." - `readlines` reads text from a file into a list of strings, it doesn't format anything. – mkrieger1 May 29 '23 at 19:24
  • Extending on mkrieger's comment, `str(lines)` is then converting the entire list into a single string, so `['abc', 'def', 'xyz']` becomes `"['abc', 'def', 'xyz']"`, and the `re.sub` operation is being done on that string, which is also removing the `[`,`]`, and `, `. Sedub's answer should work, or you could do `readlines` followed by `for line in lines: ` and then run `re.sub` on each line that way. – nigh_anxiety May 29 '23 at 19:32
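
The effect described in these comments can be reproduced without a file. The following is a minimal sketch, assuming `lines` holds what `readlines` would return for the two sample lines; the names `as_text`, `joined`, and `per_line` are illustrative and do not come from the question.

```python
import re

# Assumed stand-in for readlines() on the sample file (each line ends with "\n").
lines = ["XXTYSVASOOXOSJAY\n", "CGTVHIXXHIAICXWHAYX\n"]

# str(lines) is the repr of the whole list: brackets, quotes, commas and all.
as_text = str(lines)

# Substituting on that one string merges both lines into a single result.
joined = re.sub(r"[^X]+", "-", as_text).strip("-")

# Substituting on each element keeps the lines separate.
per_line = [re.sub(r"[^X]+", "-", line).strip("-") for line in lines]

print(joined)
print(per_line)
```

The list punctuation between the two lines contains no `X`, so it collapses into a single `-` that glues the two patterns together.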

1 Answer


To format the lines of a file with a specific pattern, iterate over the file line by line and apply the regular expression substitution to each line separately. For example:

import re

with open("test.txt", "r") as f:
    for line in f:
        result = re.sub(r"[^X]+", r"-", line).strip("-")
        print(result if "X" in result else "No X found")
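
If the results are needed as data rather than printed output, the same per-line substitution could be wrapped in a small helper. This is only a sketch: `x_pattern` and `format_lines` are hypothetical names that are not part of the answer above, and the sample list stands in for the file contents.

```python
import re
from typing import Iterable, List


def x_pattern(line: str) -> str:
    """Collapse every run of non-X characters into one "-" and trim the ends."""
    return re.sub(r"[^X]+", "-", line.strip()).strip("-")


def format_lines(lines: Iterable[str]) -> List[str]:
    """Apply x_pattern to each line; lines without an X yield "No X found"."""
    results = []
    for line in lines:
        pattern = x_pattern(line)
        results.append(pattern if "X" in pattern else "No X found")
    return results


# Sample data from the question, plus one assumed line that contains no X.
sample = ["XXTYSVASOOXOSJAY\n", "CGTVHIXXHIAICXWHAYX\n", "ABC\n"]
print(format_lines(sample))
```

With a real file, `format_lines(f)` could be called inside the `with open(...)` block, since a file object iterates over its lines.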
sedub01