How can I use Python's readlines function to format lines from a file in a specific pattern?

Question

The data I have in the test.txt file looks like this:

XXTYSVASOOXOSJAY
CGTVHIXXHIAICXWHAYX

and I'm trying to achieve a pattern like this:

XX-X
XX-X-X

This is what I have so far:

import re

data = open("test.txt", "r")
lines = data.readlines()

result = re.sub(r"[^X]+", r"-", str(lines)).strip("-")
if "X" in result:
  print(result)
else:
  print("No X found")

This is the result I get; everything ends up on a single line: XX-X-XX-X-X.

How can I do this correctly to get the expected result?

"How can I use readlines to format lines..." - `readlines` reads text from a file into a list of strings, it doesn't format anything. — mkrieger1, May 29 '23 at 19:24
Extending on mkrieger's comment, `str(lines)` is then converting the entire list into a single string, so `['abc', 'def', 'xyz']` becomes `"['abc', 'def', 'xyz']"`, and the `re.sub` operation is being done on that string, which is also removing the `[`,`]`, and `, `. Sedub's answer should work, or you could do `readlines` followed by `for line in lines: ` and then run `re.sub` on each line that way. — nigh_anxiety, May 29 '23 at 19:32

score 1 · Answer 1 · answered May 29 '23 at 19:20

To format the lines in a specific pattern, iterate over the file and apply the regular expression substitution to each line separately, rather than to the whole list at once. For example:

import re

with open("test.txt", "r") as f:
    for line in f:
        result = re.sub(r"[^X]+", r"-", line).strip("-")
        print(result if "X" in result else "No X found")
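
If you would rather keep `readlines`, as suggested in the comments, a minimal sketch could look like the one below. The helper name `format_line` is made up for illustration, and `io.StringIO` only stands in for `test.txt` so the snippet runs without a file on disk. It also shows why `str(lines)` collapses everything into a single line.

```python
import io
import re


def format_line(line):
    """Replace each run of non-X characters with one dash, then trim edge dashes."""
    return re.sub(r"[^X]+", "-", line).strip("-")


# Stand-in for open("test.txt", "r"); this is an assumption made for the demo.
sample = io.StringIO("XXTYSVASOOXOSJAY\nCGTVHIXXHIAICXWHAYX\n")
lines = sample.readlines()  # a list of strings, one per line

# What the question's code does: str(lines) is the repr of the whole list,
# so the brackets, quotes and commas are swallowed along with the letters.
joined = format_line(str(lines))

# Per-line version: each list element is formatted on its own.
formatted = [format_line(line) for line in lines]

print(joined)
for result in formatted:
    print(result if "X" in result else "No X found")
```

The first `print` reproduces the single-line output from the question, and the loop prints one formatted result per input line.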
