Questions tagged [difflib]

A Python module that provides tools for computing and working with differences between sequences; it is especially useful for comparing text. It includes functions that produce reports in several common difference formats.

A Python module that provides classes and functions for comparing sequences. It can be used, for example, to compare files, and it can produce difference information in various formats, including HTML, context diffs, and unified diffs.
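As a rough illustration of the features named in the description above, the following minimal sketch exercises four standard-library entry points: `difflib.unified_diff`, `difflib.HtmlDiff`, `difflib.SequenceMatcher`, and `difflib.get_close_matches`. The sample lines, the file labels `old.txt` and `new.txt`, and the word list are made up for the example and do not come from any question listed here.

```python
import difflib

# Hypothetical sample data: two small "files" as lists of lines.
old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "blueberry\n", "cherry\n"]

# Unified diff: a generator of diff lines, collected into a list here.
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt")
)
print("".join(diff_lines))

# HTML report: make_table() returns an HTML <table> as a string.
html_table = difflib.HtmlDiff().make_table(old, new)

# Similarity ratio between two sequences (here, two strings), from 0.0 to 1.0.
ratio = difflib.SequenceMatcher(None, "banana", "bananas").ratio()
print(round(ratio, 3))

# Closest matches to a word among a list of candidates, best match first.
matches = difflib.get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
print(matches)
```

`get_close_matches` only returns candidates whose similarity is at least its `cutoff` argument (0.6 by default), which is worth keeping in mind when a match that looks plausible is not returned.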

341 questions
0
votes
1 answer

Different lines between two files, when one line contains trailing whitespace (Python, difflib)

I want to compare two text files in Python, and return the lines that are different. My attempt uses difflib, but I'm open to other suggestions. I need to get the lines that are different, as well as the lines that appear in one file but not the…
user2524282
0
votes
2 answers

Getting a TypeError: 'float' object is not iterable when using a list of strings

I'm trying to get the closest match between two lists of strings (listA and listB) to create a listC. I need this because I have to clean a dataframe that has one column of strings, where each string represents a fruit, in which some entries…
0
votes
1 answer

Why is this string comparison not working? (difflib)

This code should be printing "I am unit Alpha 07" when the user says something like "what's your name", but for some reason the if statement never returns true. Please help! import difflib while True: talk = input("Please say something >…
0
votes
1 answer

How to Count Added or Deleted Words in two Strings in Python?

There are a lot of threads on how to check the difference in characters between two strings using difflib, but I specifically want to know if there is a way or a module that can tell me the words deleted and added between two strings. For example,…
0
votes
0 answers

iterating through difflib.get_close_matches parameters in python 3

I am trying to iterate through the "cutoff" parameter of the difflib.get_close_matches using the following code: import difflib from numpy import loadtxt text_file = open('filewithtext.txt', "r") lines = text_file.read().split(',') word =…
Eyal
0
votes
0 answers

Efficiently compare large text files?

I am trying to compare two text files of about 1 MB each in Python using difflib's SequenceMatcher. It performs very poorly on files of this size, taking up to 7 minutes the last time I ran it. Is there a more…
Caoimhe
0
votes
3 answers

Python find similar sequences in string

I want code that returns the sum of all similar sequences in two strings. I wrote the following code, but it only returns one of them: from difflib import SequenceMatcher a='Apple Banana' b='Banana Apple' def similar(a,b): c =…
Mostafa Ghafoori
0
votes
2 answers

different values returned depending on where the line is

I'm using ndiff to check the diffs between two text files and also to count how many diffs were found. At some point, I found that I was getting two different values depending on where the line of code was written... Can anyone bring…
Mr.Z.68
0
votes
1 answer

Python difflib: how does it work on lists?

I'm new to Python. I wrote a simple script that opens a file and, using a function, appends some of the lines to a generator object. Then I use this object to compute the difference with another file read the same way. I got the following error:…
Minee
0
votes
1 answer

Python difflib: sequence similarity above cutoff point, but no result on get_close_matches()

I'm using difflib to find the same streets written in different formats. Here's the one pair that really bugs me: '1-й Лихачевский переулок' and 'Переулок Лихачевский 1-й'. I calculate the sequence similarity like this: s =…
Huita
0
votes
3 answers

Finding similar strings with restricted alpha characters using Python

I want to group similar strings, but I would like the grouping to be smart enough to catch cases where only conventions like '/' or '-' differ, rather than the letters. Given the following input: moose mouse mo/os/e m.ouse alpha = ['/','.'] I want to group…
hurturk
0
votes
1 answer

difflib - prevent replacement of whole line

Comparing the following examples of using difflib.ndiff() from difflib import unified_diff, ndiff print("".join(ndiff( ["aba\n"], ["abbba\n"] ))) print("".join(ndiff( ["aba\n"], ["abbbba\n"] ))) Output: - aba + abbba ? ++ -…
Fabian N.
0
votes
1 answer

how to get the differences in a list

Below are two lists of tuples: text1_lines = [('a','1'), ('b', '2'), ('c','3')] text2_lines = [('a','4'), ('z', '5'), ('c','6')] I can get the differences with the code below: d = difflib.Differ() diff = d.compare(text1_lines,text2_lines) diff_list…
Ron
0
votes
1 answer

Python Difflib - How to get diff sequences like PHP Diff Class format

I am reading the Python difflib documentation. According to it, the difflib.Differ output is: Code Meaning '- ' line unique to sequence 1 '+ ' line unique to sequence 2 ' ' line common to both sequences '? ' line not present in either…
0
votes
2 answers

Find new inserted words in text file

I want to find the new words which are inserted into a text file using Python. For example: Old: He is a new employee here. New: He was a new, employee there. I want this list of words as output: ['was', ',' ,'there'] I used difflib but it gives…
Hellboy