How to read only the content between separators?

Question

I have a text file:

aaaa
#
bbbb
cccc
#
dddd

I only want to read the content between the two '#' lines (bbbb, cccc), line by line. My code needs two loops: the first finds the beginning and the end, and the second collects the content. Is there a better way that uses only one loop? The content of my text file is very long, so two loops are not efficient.

with open("test.txt", "r") as f:
    lines = f.read().splitlines()
begin = 0
for i in range(0, len(lines)):
    if '#' in lines[i]:
        if i > begin:
            begin = i
        if begin != 0:
            end = i
output = []
for i in range(begin + 1, end):
    output.append(lines[i])

What if there are more than two `#`, or, even worse, an odd number of `#`? – Chris Mar 21 '22 at 06:42
Have you tested that code on your input? It will produce an empty list. Your if statement logic for setting begin and end needs some changes. – kcsquared Mar 21 '22 at 06:49

score 2 · Answer 1 · answered Mar 21 '22 at 06:49

I'd probably use a generator to avoid keeping all the data in memory. In the generator, don't start yielding results until you read the first instance of the # line. Then yield lines until you reach the next # line.

def read_content(filename):
    with open(filename, "r") as file:
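        # Skip everything up to and including the first "#" line; stop at end of file.
        while (line := file.readline()) and line.strip() != "#":
            pass
        # Yield stripped lines until the next "#" line or end of file.
        while (line := file.readline()) and line.strip() != "#":
            yield line.strip()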


for line in read_content("test.txt"):
    print(line)
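
The same idea can also be written as a single loop with a flag, which is closer to the "one loop" the question asks for. This is only a sketch: the name `read_between` and the `sep` parameter are made up for illustration, and it takes any iterable of lines (an open file works) so it can be tried without a file on disk. It stops at the second separator and simply runs to the end of the input if the second separator is missing.

```python
def read_between(lines, sep="#"):
    """Yield stripped lines found between the first two separator lines."""
    inside = False
    for raw in lines:
        line = raw.strip()
        if line == sep:
            if inside:
                return  # second separator: stop reading
            inside = True  # first separator: start collecting
        elif inside:
            yield line


# Sample data standing in for the lines of test.txt (assumed layout from the question).
sample = ["aaaa\n", "#\n", "bbbb\n", "cccc\n", "#\n", "dddd\n"]
print(list(read_between(sample)))  # ['bbbb', 'cccc']
```

With a real file you would pass the file object itself, e.g. `with open("test.txt") as f:` followed by `for line in read_between(f):`, so only one line is held in memory at a time.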

score 2 · Answer 2 · answered Mar 21 '22 at 06:50

One way using itertools.dropwhile and takewhile:

from itertools import dropwhile, takewhile

func = lambda x: not x.startswith("#")

with open("test.txt") as f:
    before = dropwhile(func, f)
    _ = next(before) # To consume the first # line
    content = list(takewhile(func, before))

print(content)

Output:

['bbbb\n', 'cccc\n']

Logic:

This drops lines until the first line that starts with #, then consumes that first # line, and finally takes everything until the next line that starts with #.
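
If you want the result without trailing newlines, and no exception when the file has no # line at all, the three steps can be wrapped in a small helper. This is a sketch under those assumptions; `between_separators` and its `sep` parameter are hypothetical names, and it accepts any iterable of lines, including an open file.

```python
from itertools import dropwhile, takewhile


def between_separators(lines, sep="#"):
    """Return the lines between the first two lines that start with sep."""
    def not_sep(line):
        return not line.startswith(sep)

    rest = dropwhile(not_sep, lines)  # skip everything before the first separator
    next(rest, None)                  # consume the separator; the default avoids StopIteration
    return [line.rstrip("\n") for line in takewhile(not_sep, rest)]


# Sample data standing in for the lines of test.txt (assumed layout from the question).
sample = ["aaaa\n", "#\n", "bbbb\n", "cccc\n", "#\n", "dddd\n"]
print(between_separators(sample))  # ['bbbb', 'cccc']
```

Passing a default to `next` means an input with no separator yields an empty list instead of raising.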

answered Mar 21 '22 at 06:50

Chris

Personal preference, I don't assign function results to the `_` variable unless I have to. Here you could just call `next` without assigning to a var. Keeps objects alive for longer and is IMO more confusing to read. – flakes Mar 21 '22 at 06:52
That said, I've never used a lot of the helpers you incorporate here, and they look super neat. Functional style programming is very interesting. – flakes Mar 21 '22 at 06:56
@flakes Hmm I only used `_` to show that the result of `next` is _unused_; for which `_` is the most often used. But perhaps given the comment at the same line, it might have been unnecessary. – Chris Mar 21 '22 at 07:01
