I have a huge (1.5 GB) TSV (tab-separated values) file that I'm processing with Python. The file is line-based, but it has some ill-formatted lines that I wish to skip. My code is as follows:
fo = open(output, 'w')
with open(filename) as f:
    i = 0
    for line in f:
        print i
        try: #to account for the ill-formatted lines
            user_hash, artist_hash, artist, playcount = line.split('\t')
            fo.write('{0}\t{1}\t{2}'.format(hash_map[user_hash], artist, playcount))
            i = i+1
        except:
            print "error in user_hash : " + user_hash
            continue
Now the problem is that the program terminates as soon as it catches the first exception: it just prints "error in user_hash" and then exits. It should have continued, because I know that the file has 17 million+ lines and i only reached 433919.
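Two things in the loop above make this hard to diagnose. The bare except catches every error, including a KeyError from the hash_map lookup. When the split itself fails, user_hash still holds the value from the previous line, so the printed message can point at the wrong row. Also, i only counts lines that were written, so 433919 is not necessarily the line number where things stopped. Below is a minimal sketch of a more diagnostic variant. It is written for Python 3 (the code above is Python 2), and the function name convert, the skipped list, and the sample data are made up for illustration. It does not by itself explain the early stop, but it records which line was rejected and why.

```python
import io


def convert(lines, hash_map, out):
    """Write well-formed rows to `out`; return (written, skipped).

    `skipped` holds one (line_number, reason) pair per rejected line.
    """
    written = 0
    skipped = []
    for lineno, line in enumerate(lines, 1):
        fields = line.rstrip('\n').split('\t')
        if len(fields) != 4:
            # Ill-formatted line: wrong number of tab-separated fields.
            skipped.append((lineno, 'expected 4 fields, got %d' % len(fields)))
            continue
        user_hash, artist_hash, artist, playcount = fields
        if user_hash not in hash_map:
            # Well-formed line, but the key is missing from the mapping.
            skipped.append((lineno, 'unknown user_hash %r' % user_hash))
            continue
        out.write('{0}\t{1}\t{2}\n'.format(hash_map[user_hash], artist, playcount))
        written += 1
    return written, skipped


# Tiny in-memory demo with hypothetical data (no real file needed).
sample = io.StringIO(
    "u1\ta1\tArtist A\t10\n"
    "broken line\n"
    "u9\ta2\tArtist B\t3\n"
    "u1\ta3\tArtist C\t7\n"
)
result = io.StringIO()
written, skipped = convert(sample, {'u1': 1}, result)
for lineno, reason in skipped:
    print('line %d skipped: %s' % (lineno, reason))
```

With the real data, the same function could be called with the open input and output file objects in place of the two StringIO buffers. The last entry in skipped and the total of written plus skipped lines would then show how far the loop actually got.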
Why is this happening?
Thanks for reading.