I'm wondering what the best way is to delete lines from a tabular text file (while keeping the header) so that only the rows containing a specific word remain.
Say, for example, I have a tabular text file of animals with their names and ages (the headers are Animals/Names/Ages). How could I delete all lines that do not have 'Dog' in the 'Animals' column?
Animals Names Ages
Dog Pippin 10
Dog Merry 14
Dog Frodo 12
Cat Sauron 11
Bird Gandalf 10
Bird Mordor 12
and I only want:
Animals Names Ages
Dog Pippin 10
Dog Merry 14
Dog Frodo 12
My example code is below, with notes in the comments:
import os

headers = 1
field1 = 'ANIMALS'
sep = ' '

def getIndex(delimString, delimiter, name):
    '''Get position of item in a delimited string'''
    delimString = delimString.strip()
    lineList = delimString.split(delimiter)
    index = lineList.index(name)
    return index

infile = 'C:/example'
outfile = 'C:/folder/animals'

try:
    with open(infile, 'r') as fin:
        with open(outfile, 'w') as fout:
            for i in range(headers):
                line = fin.readline()
                fout.write(line)
            line = fin.readline()
            fout.write(line)
            # This is where I get confused, I try using the method below:
            for line in fin:
                lineList = line.split(sep)
                # But the code doesn't work as it only prints the header
                # I have a feeling it's the way I'm phrasing this area
                if field1 == 'DOG':
                    fout.write(line)
    print '{0} created.'.format(outfile)
except IOError:
    print "{0} doesn't exist- send help".format(infile)
What is the best way to selectively write rows from a tabular .txt file?
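For reference, one likely reason the attempt writes no data rows is that `field1 == 'DOG'` compares the constant 'ANIMALS' with 'DOG', which is never true. The split line is never inspected. Below is a minimal sketch of one way to compare the right column instead. The helper name `filter_rows` and the in-memory `sample` list are hypothetical and not part of the original code. The sketch also assumes a single header line, whitespace-separated fields, and that a case-insensitive match is acceptable (the file has 'Animals' and 'Dog', while the code uses 'ANIMALS' and 'DOG').

```python
def filter_rows(lines, column, value, sep=None):
    """Yield the header line, then only rows whose `column` field equals `value`.

    Hypothetical helper for illustration. Assumes exactly one header line.
    sep=None splits on any run of whitespace and ignores the trailing newline.
    """
    lines = iter(lines)
    header = next(lines, None)
    if header is None:
        return  # empty input: nothing to yield
    # Find the column position by name, ignoring case.
    names = [name.lower() for name in header.split(sep)]
    col_index = names.index(column.lower())  # ValueError if the column is missing
    yield header
    for line in lines:
        fields = line.split(sep)
        # Skip blank or short lines rather than raising IndexError.
        if len(fields) > col_index and fields[col_index].lower() == value.lower():
            yield line


# In-memory stand-in for the file from the question.
sample = [
    'Animals Names Ages\n',
    'Dog Pippin 10\n',
    'Dog Merry 14\n',
    'Dog Frodo 12\n',
    'Cat Sauron 11\n',
    'Bird Gandalf 10\n',
    'Bird Mordor 12\n',
]
kept = list(filter_rows(sample, 'ANIMALS', 'DOG'))
```

With real files, the same generator could be fed an open file object, for example `fout.writelines(filter_rows(fin, field1, 'DOG'))` inside the two `with` blocks. The original lines are written back unchanged, so the output keeps the input's spacing and line endings.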