I'm playing around with difflib in Python and I'm having some difficulty getting the output to look good. For some strange reason, difflib is adding a single whitespace before each character. For example, I have a file (textfile01.txt) that looks like this:

test text which has no meaning

and textfile02.txt

test text which has no meaning

but looks nice

Here's a small code sample for how I'm trying to accomplish the comparison:

import difflib

handle01 = open('textfile01.txt', 'r')
handle02 = open('textfile02.txt', 'r')

d = difflib.ndiff(handle01.read(), handle02.read())
print "".join(list(d))

Then, I get this ugly output that looks...very strange:

t e s t t e x t w h i c h h a s n o m e a n i n g-

- b- u- t- - l- o- o- k- s- - n- i- c- e

As you can see, the output looks horrible. I've just been following basic difflib tutorials I found online, and according to those, the output should look completely different. I have no clue what I'm doing wrong. Any ideas?

erichar7

1 Answer

difflib.ndiff compares lists of strings, but you are passing it plain strings, and a string is itself a sequence of characters. The function is therefore comparing the two strings character by character.

>>> list(difflib.ndiff("test", "testa"))
['  t', '  e', '  s', '  t', '+ a']

(Literally, you can go from the list ["t", "e", "s", "t"] to the list ["t", "e", "s", "t", "a"] by adding the element "a" at the end.)

You want to change read() to readlines() so you can compare the two files in a linewise fashion, which is probably what you were expecting.

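Whether you also need to change "".join(... to "\n".join(... depends on the lines: readlines() keeps each line's trailing newline, so "".join(... already prints one diff line per row. Use "\n".join(... only when the lines carry no newline of their own, as in the example below.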

>>> list(difflib.ndiff(["test"], ["testa"]))
['- test', '+ testa', '?     +\n']
>>> print "\n".join(_)
- test
+ testa
?     +

(Here difflib is being extra nice and marking the exact position where the character was added in the ? line.)
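Putting the changes together, here is a minimal sketch of the line-wise comparison. It is written in Python 3 syntax (print as a function), and it uses two in-memory strings as stand-ins for the files from the question, so text01, text02 and the helper line_diff are illustrative names rather than anything from the original code. The strings are split with splitlines(keepends=True), which for ordinary text gives much the same list that readlines() would return.

```python
import difflib

# Hypothetical stand-ins for the contents of the two files in the question.
# Assumption: every line ends with a newline, as readlines() would return it.
text01 = "test text which has no meaning\n"
text02 = "test text which has no meaning\nbut looks nice\n"


def line_diff(a, b):
    """Return the ndiff of two strings, compared line by line."""
    # Pass lists of lines, not raw strings, so ndiff does not
    # fall back to comparing character by character.
    return list(difflib.ndiff(a.splitlines(keepends=True),
                              b.splitlines(keepends=True)))


result = line_diff(text01, text02)

# Each entry already ends in "\n", so join on the empty string.
print("".join(result), end="")
```

With real files, the two splitlines calls would be replaced by handle01.readlines() and handle02.readlines().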

badp
  • That fixed it. I didn't realize it was looking for a list of strings. Most of the examples I was looking at "appeared" to be using normal strings. Thanks for your assistance! – erichar7 Jan 19 '15 at 22:31
  • Indeed, the official example in the official documentation uses strings: https://docs.python.org/3/library/difflib.html#difflib.Differ – Maïeul May 14 '21 at 09:39