For questions pertaining to SequenceMatcher from the Python difflib module. This is a flexible class for comparing pairs of sequences of any type, so long as the sequence elements are hashable. difflib is part of the Python standard library.
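As a minimal sketch of what the tag covers, the snippet below compares two lists of integers, which shows that the class is not limited to strings. The sample data and variable names are illustrative assumptions and are not taken from any question on this page.

```python
from difflib import SequenceMatcher

# Illustrative data (an assumption for this sketch): any sequences whose
# elements are hashable will do, not only strings.
a = [1, 2, 3, 4]
b = [1, 2, 4, 5]

# The first argument is the optional "isjunk" callable; None disables it.
matcher = SequenceMatcher(None, a, b)

# ratio() returns 2 * M / T, where M is the number of matched elements
# and T is the combined length of both sequences.
ratio = matcher.ratio()

# get_opcodes() describes how to turn a into b as
# (tag, i1, i2, j1, j2) tuples.
opcodes = matcher.get_opcodes()

print(ratio)
for tag, i1, i2, j1, j2 in opcodes:
    print(tag, a[i1:i2], b[j1:j2])
```

Here three elements match (1, 2 and 4) out of eight in total, so ratio() gives 0.75, and the opcodes report one deleted element (3) and one inserted element (5).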
Questions tagged [sequencematcher]
72 questions
1
vote
1 answer
How to delete invalid characters between multiple strings in Python?
I'm working on a project with OCR in Spanish. The camera captures different frames of a line of text. The line of text contains this:
Este texto, es una prueba del dispositivo lector para no videntes.
After some operations I get strings like…

Alex Ortega
1
vote
0 answers
Custom items for list alignment with SequenceMatcher
I am using SequenceMatcher to align two lists. Each list item is either a tuple or an integer. The requirement is that a tuple containing a particular integer is considered equal to that integer.
For example:
(1, 2, 3) == 1 #True
(1, 2, 3) == 2 #True
To do…

jalal
0
votes
1 answer
How to return the best match via SequenceMatcher
I have to match a product's category name returned from an API response against the product's category names from a database.
For example:
api_category = "packing tape",
category names from DB = ["packing material", "packaging equipment"]
from difflib import…

Irina_Xena
0
votes
0 answers
Python SequenceMatcher (difflib) not providing correct results for delete tag
I'm using SequenceMatcher to compare the output of usernames from an API list and an LDAP group. The intent is to add, and separately, remove users.
I've got the 'add' part working. I can't get the 'remove' part to give me the correct list of…

Dan
0
votes
1 answer
How can I compare one column of a dataframe to multiple other columns using SequenceMatcher?
I have a dataframe with 6 columns, the first two are an id and a name column, the remaining 4 are potential matches for the name column.
id name match1 match2 match3 match4
id name …

Sammyg
0
votes
0 answers
Difflib Sequence Matcher Algorithm
SequenceMatcher is a class available in the Python module named 'difflib'. It can be used for comparing pairs of input sequences. I'm writing a research paper for which I need the steps of the actual algorithm used by this class. According to the…

Hamza
0
votes
1 answer
How to perform sequence matcher on dataframe values in a row in Python?
New to Python, so kind of figuring things out. I have a dataframe from an Excel spreadsheet.
Something like this:
MANUFACTURER
MANUFACTURER PART NUMBER…

Jace
0
votes
0 answers
difflib.SequenceMatcher not working in an "if" statement?
I am running code with SequenceMatcher (difflib library) nested in an "if" statement like this:
'''
from difflib import SequenceMatcher
string_one = 'He is right'
string_two = 'He was right'
print("It returns a ratio",…
0
votes
0 answers
SequenceMatcher ratio return
For example,
with a = 'OrangeApple' and b = 'AppleOrange', running SequenceMatcher(None, a, b).ratio() returns a ratio (similarity score) of 0.54.
If a = 'OrangeApple' and b = 'OrangeApple', the returned ratio is, as expected, 1.
I somehow…

singlequit
0
votes
0 answers
Compare two text columns to measure their similarity in a dataframe in Python
I want to compare column A with C and also B with C, measure each pair's similarity, and then report the pair that has the higher degree of similarity.
df = pd.DataFrame([['JAMES LIKEN', 'LINDEN R. EVANS', 'LINDEN R. EVANS'], ['HENRY THEISEN',…
0
votes
1 answer
Is SequenceMatcher supported by Chaquopy?
Does Chaquopy support
from difflib import SequenceMatcher
or does a package have to be installed with pip first, and if so, which pip package is needed to use SequenceMatcher?

mir shahab
0
votes
1 answer
How do I match on the best SequenceMatcher ratio?
I use the SequenceMatcher ratio to match two dataframes on the best ratio.
I want to check first whether the score between A and AA is good, then whether the score between B and BB is good, then whether the score between C and CC is good; if so, I add the line
…

Créative-app
0
votes
2 answers
Find common fragments in multiple strings using SequenceMatcher
I would like to find the common string among:
strings_list = ['PS1 123456 Test', 'PS1 758922 Test', 'PS1 978242 Test']
The following code returns only the first part, "PS1 1"; I would expect the result to be "PS1 Test". Could you help me, is it possible…

Elka
0
votes
1 answer
Similarity ratio from a list of excluded strings
When comparing the similarity of 2 strings, I want to exclude a list of strings, for example, ignore 'Texas' and 'US'.
I tried to use the argument 'isjunk' in difflib's SequenceMatcher:
exclusion = ['Texas', 'US']
sr = SequenceMatcher(lambda x: x in…

Mark K
0
votes
0 answers
Comparing strings in Python with tools such as SequenceMatcher and textdistance and the difference in their algorithms
I am working with a dataframe that has 2 columns of city names which should be equal, but they are not, due to administrative errors, spelling mistakes or name changes. I am trying to see when those city names are 'equal enough' to be assumed equal.…

Hestaron