First example:
from difflib import SequenceMatcher

one = ['billy', 'sally', 'gd', 'kk', 'btb']
two = ['billy', 'sally', 'hh', 'kk', 'ff', 'btb']
opcodes1 = SequenceMatcher(None, one, two).get_opcodes()
opcodes2 = SequenceMatcher(None, two, one).get_opcodes()
This correctly returns the insert of 'ff':
[('equal', 0, 5, 0, 5), ('replace', 5, 6, 5, 6), ('equal', 6, 9, 6, 9), ('insert', 9, 9, 9, 11), ('equal', 9, 10, 11, 12)]
[('equal', 0, 5, 0, 5), ('replace', 5, 6, 5, 6), ('equal', 6, 9, 6, 9), ('delete', 9, 11, 9, 9), ('equal', 11, 12, 9, 10)]
Now, I would like get_opcodes() to find an 'insert' that sits right next to a 'replace', but it is unable to.
Second example:
one = ['billy', 'sally', 'gd', 'kk', 'btb']
two = ['billy', 'sally', 'hh', 'kk1', 'ff', 'btb']
opcodes1 = SequenceMatcher(None, one, two).get_opcodes()
opcodes2 = SequenceMatcher(None, two, one).get_opcodes()
returns:
[('equal', 0, 2, 0, 2), ('replace', 2, 4, 2, 5), ('equal', 4, 5, 5, 6)]
[('equal', 0, 2, 0, 2), ('replace', 2, 5, 2, 4), ('equal', 5, 6, 4, 5)]
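To make it easier to see what each opcode covers, here is a minimal sketch that maps the index ranges back to the actual list slices. The helper name describe_opcodes is hypothetical (it is not part of difflib); it only wraps get_opcodes().

```python
from difflib import SequenceMatcher

def describe_opcodes(a, b):
    """Hypothetical helper: pair each opcode tag with the slices it refers to."""
    matcher = SequenceMatcher(None, a, b)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()]

one = ['billy', 'sally', 'gd', 'kk', 'btb']
two = ['billy', 'sally', 'hh', 'kk1', 'ff', 'btb']

for row in describe_opcodes(one, two):
    print(row)
# ('equal', ['billy', 'sally'], ['billy', 'sally'])
# ('replace', ['gd', 'kk'], ['hh', 'kk1', 'ff'])
# ('equal', ['btb'], ['btb'])
```

The 'ff' ends up inside the single 'replace' block because 'kk' and 'kk1' are different elements, so nothing matches between 'sally' and 'btb'.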
In the next example I try to force the difference to be recognized by adding padding between the items. Surprisingly, the padding is ignored. That is surprising because the 'kk' in the first example acts as padding itself, stopping the 'gd' vs 'hh' replace from being considered part of the 'ff' insert.
Third example:
one = ['///////', 'billy', '///////', 'sally', '///////', 'gd', '///////', 'kk', '///////', 'btb']
two = ['///////', 'billy', '///////', 'sally', '///////', 'hh', '///////', 'kk1', '///////', 'ff', '///////', 'btb']
opcodes1 = SequenceMatcher(None, one, two).get_opcodes()
opcodes2 = SequenceMatcher(None, two, one).get_opcodes()
returns:
[('equal', 0, 5, 0, 5), ('replace', 5, 6, 5, 6), ('equal', 6, 7, 6, 7), ('replace', 7, 8, 7, 10), ('equal', 8, 10, 10, 12)]
[('equal', 0, 5, 0, 5), ('replace', 5, 6, 5, 6), ('equal', 6, 7, 6, 7), ('replace', 7, 10, 7, 8), ('equal', 10, 12, 8, 10)]
Once again, it fails to recognize 'ff' as an insert even though it is clearly there.
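One possible workaround, sketched here under a strong assumption, is to post-process the opcodes: whenever a 'replace' covers a different number of items on each side, pair the items up by position and report the leftover items at the end of the block as an 'insert' or 'delete'. The function split_unequal_replaces is a hypothetical name, not a difflib API, and positional pairing is only a guess at which item is the "real" insert.

```python
from difflib import SequenceMatcher

def split_unequal_replaces(opcodes):
    """Split each uneven 'replace' into an even 'replace' plus 'insert'/'delete'.

    Assumption: items are paired by position, so the surplus items are
    always taken from the end of the replaced block.
    """
    result = []
    for tag, i1, i2, j1, j2 in opcodes:
        len_a, len_b = i2 - i1, j2 - j1
        if tag != 'replace' or len_a == len_b:
            result.append((tag, i1, i2, j1, j2))
            continue
        common = min(len_a, len_b)
        # Same-length part stays a 'replace'.
        result.append(('replace', i1, i1 + common, j1, j1 + common))
        if len_b > len_a:
            # Surplus items on the b side become an 'insert'.
            result.append(('insert', i2, i2, j1 + common, j2))
        else:
            # Surplus items on the a side become a 'delete'.
            result.append(('delete', i1 + common, i2, j2, j2))
    return result

one = ['billy', 'sally', 'gd', 'kk', 'btb']
two = ['billy', 'sally', 'hh', 'kk1', 'ff', 'btb']

opcodes = split_unequal_replaces(SequenceMatcher(None, one, two).get_opcodes())
print(opcodes)
# [('equal', 0, 2, 0, 2), ('replace', 2, 4, 2, 4), ('insert', 4, 4, 4, 5), ('equal', 4, 5, 5, 6)]
```

This happens to give the wanted result for the second example, but it would mislabel the insert if the new item sat at the start or middle of the replaced block. A more careful version would probably need to compare the individual strings (for example 'kk' vs 'kk1') to decide how to pair them.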