I'm reading a file line by line and want to pull out specific parts of each line. Right now I look for a keyword in the line and then read the line character by character. In C/C++ I would just throw the string in a for loop and iterate through it.

This is the code I have so far.

i = 0

with open("test.txt") as f:
    for line in f:
        if "test" in line:
            for character in line:
                if character == "\"":
                    # append all characters to a string until the 2nd quote is seen
                    pass

Any ideas?

jtor

1 Answer

Try this:

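in_string = False      # True while we are between an opening and a closing quote
current_string = ""    # characters collected since the opening quote
strings = []           # every completed quoted string

with open("test.txt") as f:
    for line in f:
        if "test" in line:
            for character in line:
                if character == '"':
                    if in_string:
                        # closing quote: store what was collected
                        strings.append(current_string)
                    in_string = not in_string
                    current_string = ""
                elif in_string:
                    current_string += character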

It iterates through every character of each matching line. When it sees a double quote ("), it either starts collecting the following characters into a string or, if it was already collecting, stops and appends the collected string to a list. Single quotes (') are not treated as delimiters.
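
To try the loop without a file on disk, here is a minimal sketch of the same logic wrapped in a function. The name `quoted_parts` and the sample line are made up for illustration. One difference from the loop above: the state is reset on every call, so an unclosed quote on one line does not carry over into the next.

```python
def quoted_parts(line):
    """Return the substrings of `line` enclosed in double quotes."""
    parts = []
    current = []          # characters collected since the opening quote
    in_string = False     # True while between an opening and a closing quote
    for character in line:
        if character == '"':
            if in_string:
                # closing quote: store what was collected
                parts.append("".join(current))
            in_string = not in_string
            current = []
        elif in_string:
            current.append(character)
    return parts


# Hypothetical sample line, not taken from the asker's file.
sample = 'test name="alpha" value="beta gamma"\n'
print(quoted_parts(sample))  # ['alpha', 'beta gamma']
```

Collecting into a list and joining at the end avoids rebuilding the string on every character, but for short lines either approach should be fine.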

Or, with regex:

import re
strings = []

with open("test.txt") as f:
    for line in f:
        if "test" in line:
            strings.extend(re.findall(r'"(.*?)"', line, re.DOTALL))
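
As a rough check of what the regex version collects, the sketch below runs the same loop over an in-memory stand-in for `test.txt`. The file contents are invented for illustration, and `io.StringIO` is only there so the example runs without a real file. `re.DOTALL` is left out here because each line is matched on its own.

```python
import io
import re

# Stand-in for open("test.txt"); the contents are made up for illustration.
fake_file = io.StringIO(
    'test a="one" b="two"\n'
    'skipped "not collected"\n'
    'test empty="" tail="three\n'
)

strings = []
for line in fake_file:
    if "test" in line:
        # Non-greedy group: each match stops at the nearest closing quote.
        strings.extend(re.findall(r'"(.*?)"', line))

print(strings)  # ['one', 'two', '']
```

The second line is skipped because it lacks the keyword, the empty pair of quotes yields an empty string, and the unclosed quote on the last line produces no match.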
  • Thanks, this should be enough to get me started. – jtor Jan 24 '15 at 16:48
  • You can do this in a much simpler way without RE. Unfortunately the thread got closed before I could post my answer. Solution: `splitted = open('test.txt').read().split('\"')` `quotted = [ splitted[i] for i in range(1, len(splitted), 2) ]` – Pithikos Jan 24 '15 at 16:59
  • @Pithikos This would not work if there are no quotes, and it will include the text after the last opening quote even if that quote is never closed. Also, [jtor](http://stackoverflow.com/users/3517223/jtor) is checking if `test` is in the line before checking for quoted words, which your method can't do. –  Jan 24 '15 at 17:05
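
The comments above raise two concerns about the split approach: text after an unclosed quote gets picked up, and the `test` keyword check is lost. A possible way to keep the split idea while addressing both is to split each line separately and drop the trailing piece when the quote count is odd. This is only a sketch; `quoted_by_split` and the sample lines are hypothetical names and data, not something from the thread.

```python
def quoted_by_split(line):
    """Quoted substrings of `line`, taken from the odd-indexed split pieces."""
    pieces = line.split('"')
    # An even number of pieces means an odd number of quotes, so the last
    # piece follows an unclosed quote and is dropped.
    if len(pieces) % 2 == 0:
        pieces = pieces[:-1]
    return pieces[1::2]


# Hypothetical lines standing in for the contents of test.txt.
lines = ['test a="one" b="two"\n', 'other "x"\n', 'test "open\n']

strings = []
for line in lines:
    if "test" in line:          # same keyword filter as in the question
        strings.extend(quoted_by_split(line))

print(strings)  # ['one', 'two']
```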