I need to compare two .csv files (each over 65,000 lines) and find the lines in the first file that are not in the second. I am using difflib.ndiff:
import difflib
for line in difflib.ndiff(text1, text2):
    print(line,)
But I get unexpected results. The function finds two identical strings and marks them as different:
+ Gr4,DQ_3Gb_1m_DR_926_23489,100,,,70,,
- Gr4,DQ_3Gb_1m_DR_926_23489,100,,,70,,
- What could be the problem?
- What might be a suitable way to find the differences?
2.
from itertools import izip_longest
l1 = map(lambda x: x.strip(), list(open('test1.txt')))
l2 = map(lambda x: x.strip(), list(open('test2.txt')))
diff_list = izip_longest(l1, l2)
for diff in diff_list:
    print '%s %s %s' % (
        diff[0] or '',
        '==' if diff[0] == diff[1] else '!=',
        diff[1] or '',
    )
I also tried the code above to compare the files, but I got the same unexpected result. Why is this so?
Gr4,DQ_1Gb_1m_DR_926_23486,100,,,70,,!=Gr4,DQ_3Gb_1m_DR_926_23489,100,,,70,,
Gr4,DQ_3Gb_1m_DR_926_23489,100,,,70,,!=Gr4,DQ_1Gb_1m_DR_926_23486,100,,,70,,
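The output above suggests that the same rows appear in both files but at different positions, and a line-by-line comparison reports that as a mismatch. Below is a minimal sketch of an order-independent check. It assumes row order does not matter and that duplicate rows can be ignored. `lines_only_in_first` is a hypothetical helper name, not part of difflib.

```python
def lines_only_in_first(first_lines, second_lines):
    """Return the lines of first_lines that never occur in second_lines."""
    # Strip newlines and surrounding whitespace so that
    # '...70,,\n' and '...70,,' compare equal.
    seen = set(line.strip() for line in second_lines)
    # Keep the original order of the first input.
    return [line.strip() for line in first_lines if line.strip() not in seen]


# The two sample rows from the output above, in swapped order.
first = ['Gr4,DQ_1Gb_1m_DR_926_23486,100,,,70,,\n',
         'Gr4,DQ_3Gb_1m_DR_926_23489,100,,,70,,\n']
second = ['Gr4,DQ_3Gb_1m_DR_926_23489,100,,,70,,\n',
          'Gr4,DQ_1Gb_1m_DR_926_23486,100,,,70,,\n']
missing = lines_only_in_first(first, second)  # no row is missing here
```

With real files, the open file objects could be passed directly, for example lines_only_in_first(open('test1.txt'), open('test2.txt')). A set lookup keeps this fast enough for files of 65,000 lines.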