You can identify every fifth line by comparing the line number modulo 5
against a number. In your case this should be 0,
because you want the first line, the 6th, the 11th, and so on (note that Python starts counting at index 0, so these are indices 0, 5, 10, ...).
To get the line numbers as well as the content, you can iterate over the file with enumerate.
Then, to discard the name: part of the string and keep what comes after, you can use str.split().
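As a quick illustration of the split step on a single line (the sample strings are made up for this sketch, not taken from your file):

```python
# Made-up sample line, including the trailing newline a file would give you.
line = 'name: Kelo\n'

# Split only at the first ":" and keep the part after it.
after_colon = line.split(':', 1)[1]

# Strip the leading space and the trailing newline.
name = after_colon.strip()
print(name)

# With maxsplit=1 a second ":" inside the value is left alone.
tricky = 'name: Dr: Who\n'.split(':', 1)[1].strip()
print(tricky)
```

Without the second argument, the tricky line would be cut into three pieces and index 1 would only hold part of the value.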
A working implementation could look like this:
# Create an empty list for the names
names = []
# Opening the file with "with" makes sure it is automatically closed even
# if the program encounters an Exception.
with open('name_data.txt', 'r') as file:
    for lineno, line in enumerate(file):
        # The lineno modulo 5 is zero for the first line and every fifth line thereafter.
        if lineno % 5 == 0:
            # Make sure it really starts with "name"
            if not line.startswith('name'):
                raise ValueError('line did not start with "name".')
            # Split the line by the ":" and keep only what is coming after it.
            # Using `maxsplit=1` makes sure you don't run into trouble if the name
            # contains ":" as well (may be unnecessary but better safe than sorry!)
            name = line.split(':', 1)[1]
            # Remove any remaining whitespace around the name
            name = name.strip()
            # Save the name in the list of names
            names.append(name)
# print out the list of names
print(names)
Instead of enumerate you could also use itertools.islice with a step argument:
from itertools import islice
with open('name_data.txt', 'r') as file:
    for line in islice(file, None, None, 5):
        ...  # like above except for the "if lineno % 5 == 0:" line
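Spelled out in full, the islice variant could look roughly like the sketch below. To keep it runnable on its own it reads from an io.StringIO stand-in instead of name_data.txt, and the location and members values are invented; it assumes each record is four lines followed by one blank line.

```python
import io
from itertools import islice

# Stand-in for the real file. The location/members values are invented;
# each record is assumed to be four lines plus one blank separator line.
sample = io.StringIO(
    'name: Kelo\nfamily name: Lam\nlocation: Asia\nmembers: Kelo, Kiko\n\n'
    'name: Miko\nfamily name: Naiton\nlocation: Japan\nmembers: Miko, Mino\n\n'
)

names = []
# islice(..., None, None, 5) yields the lines at index 0, 5, 10, ...
for line in islice(sample, None, None, 5):
    # Make sure it really starts with "name"
    if not line.startswith('name'):
        raise ValueError('line did not start with "name".')
    # Keep what comes after the first ":" and strip surrounding whitespace.
    names.append(line.split(':', 1)[1].strip())

print(names)
```

Because islice already skips the unwanted lines, the loop body no longer needs the line number at all.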
Depending on your needs, you might consider using the re module to completely parse the file:
import re
# The regular expression
group = re.compile(r"name: (.+)\nfamily name: (.+)\nlocation: (.+)\nmembers: (.+)\n", flags=re.MULTILINE)
with open('name_data.txt', 'r') as file:
    # Apply the regex to the file's contents. re.findall needs a string,
    # not the file object itself, so read the whole file first.
    all_data = re.findall(group, file.read())
# To get the names you just need the first element in each group:
firstnames = [item[0] for item in all_data]
The firstnames list will be ['Kelo', 'Miko'] for your example. Similarly, if you use [item[1] for item in all_data] you get the last names: ['Lam', 'Naiton'].
To use a regular expression successfully, you must ensure it really matches your file layout; otherwise you'll get wrong results, because re.findall silently skips any record that does not match the pattern.
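One way to guard against that is to compare the number of regex matches with the number of lines that start a record. This is only a sketch: the sample text is invented, and counting lines that begin with "name:" is an assumption about your layout.

```python
import re

# Same pattern as in the regex example; the sample text is invented for illustration.
group = re.compile(r"name: (.+)\nfamily name: (.+)\nlocation: (.+)\nmembers: (.+)\n", flags=re.MULTILINE)

text = (
    'name: Kelo\nfamily name: Lam\nlocation: Asia\nmembers: Kelo, Kiko\n\n'
    'name: Miko\nfamily name: Naiton\nlocation: Japan\nmembers: Miko, Mino\n\n'
)

all_data = group.findall(text)

# Count the lines that open a record ("family name:" does not start with "name:").
expected = sum(1 for line in text.splitlines() if line.startswith('name:'))

# If the counts differ, some records did not fit the pattern and were skipped.
if len(all_data) != expected:
    raise ValueError(f'expected {expected} records but the regex matched {len(all_data)}')

firstnames = [item[0] for item in all_data]
print(firstnames)
```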