I don't fully understand your code (for example, why do you increment i and j inside the loop?). But the main problem is that you have a nested loop, which makes the runtime of the algorithm O(n^2), i.e. if the file becomes 10 times as large, the execution time will become (approximately) 100 times as long.
So you need a way to avoid that. One possible way is to store the lines in a smarter way, so that you don't have to walk through all lines every time. Then the runtime becomes roughly O(n) in the number of lines (sorting the letters of each line is cheap as long as the lines are short). In this case you can use the fact that anagrams consist of the same characters, only in a different order. So you can use the "sorted" variant of a line as a dictionary key, and store all lines that can be made from the same letters in a list under that key. There are other possibilities of course, but in this case I think it works out quite nicely :-)
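To see why the sorted letters work as a key, here is a minimal sketch (the helper name `anagram_key` and the sample words are only for illustration, they are not from your code):

```python
def anagram_key(word):
    # Sorting the characters gives the same string for every anagram of the word.
    return ''.join(sorted(word))

print(anagram_key('listen'))  # eilnst
print(anagram_key('silent'))  # eilnst
print(anagram_key('listen') == anagram_key('google'))  # False
```

Two words are anagrams exactly when their keys are equal, so a single dictionary lookup replaces the inner loop.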
So, fully working example code:
    #!/usr/bin/env python3
    from collections import defaultdict

    d = defaultdict(list)
    with open('file.txt') as file:
        lines = [line.strip() for line in file]

    for line in lines:
        sorted_line = ''.join(sorted(line))
        d[sorted_line].append(line)

    anagrams = [d[k] for k in d if len(d[k]) > 1]
    # anagrams is a list of lists of lines that are anagrams

    # I would say the number of anagrams is:
    count = sum(map(len, anagrams))
    # ... but in your example you're not counting the first words, only the "duplicates", so:
    count -= len(anagrams)
    print('There are', count, 'anagram words')
UPDATE
Without duplicates, and without using collections (as requested by OP in a comment, although I strongly recommend using it):
    #!/usr/bin/env python3

    d = {}
    with open('file.txt') as file:
        lines = [line.strip() for line in file]

    lines = set(lines)  # remove duplicates

    for line in lines:
        sorted_line = ''.join(sorted(line))
        if sorted_line in d:
            d[sorted_line].append(line)
        else:
            d[sorted_line] = [line]

    anagrams = [d[k] for k in d if len(d[k]) > 1]
    # anagrams is a list of lists of lines that are anagrams

    # I would say the number of anagrams is:
    count = sum(map(len, anagrams))
    # ... but in your example you're not counting the first words, only the "duplicates", so:
    count -= len(anagrams)
    print('There are', count, 'anagram words')
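If you want to check the logic without a file, the same idea can be wrapped in a function. This is only a sketch: the name `anagram_groups` and the sample words are made up for illustration, and it uses `dict.setdefault` and `dict.fromkeys` (which drops duplicates but keeps the input order, unlike `set`) instead of the if/else above.

```python
def anagram_groups(words):
    """Return lists of distinct words that are anagrams of each other."""
    d = {}
    # dict.fromkeys drops duplicate words but keeps their original order
    for word in dict.fromkeys(words):
        d.setdefault(''.join(sorted(word)), []).append(word)
    # keep only the keys that collected more than one word
    return [group for group in d.values() if len(group) > 1]

# hypothetical sample input; note that 'cat' appears twice
words = ['listen', 'silent', 'enlist', 'google', 'gogole', 'cat', 'act', 'dog', 'cat']
groups = anagram_groups(words)
total = sum(map(len, groups))  # every word that has at least one anagram
extra = total - len(groups)    # the same, but not counting the first word of each group
print(groups)
print(total, extra)  # 7 4
```

Here the three groups contain 3, 2 and 2 words, so the two ways of counting give 7 and 4.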