I haven't found anything relevant on Google, so I'm hoping to find some help here :)
I've got a Python list as follows:
[['hoose', 200], ["Bananphone", 10], ['House', 200], ["Bonerphone", 10], ['UniqueValue', 777] ...]
I have a function that returns the Levenshtein distance between two strings; for example, for 'House' and 'hoose' it returns 2.
Now I want to merge list elements whose names have a Levenshtein distance below some threshold (e.g. < 5), while (!) adding up their numeric values, so for the resulting list I want the following:
[['hoose', 400], ["Bananphone", 20], ['UniqueValue', 777], ...]
or
[['House', 400], ["Bonerphone", 20], ['UniqueValue', 777], ...]
etc.
It doesn't matter which of the similar names is kept, as long as their values get added.
Each item will have at most one very similar counterpart in the list, so I don't expect a chain effect where one item is similar to many others and swallows them all.
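
To make the goal concrete, here is a rough sketch of the behaviour I'm after. The levenshtein function below is only a stand-in for the one I already have, and merge_similar and threshold are names made up for illustration. It is a naive pass that compares each item against the entries kept so far and keeps the first name it sees, so there may well be a cleaner or faster way:

```python
def levenshtein(a, b):
    """Stand-in for the existing distance function (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def merge_similar(items, threshold=5):
    """Merge entries whose names are closer than `threshold`, summing their values.

    The first name encountered is kept as the representative of each pair.
    """
    merged = []
    for name, value in items:
        for entry in merged:
            if levenshtein(name, entry[0]) < threshold:
                entry[1] += value  # similar name already kept: add the value
                break
        else:
            # no similar name found so far: keep this one as a new entry
            merged.append([name, value])
    return merged


data = [['hoose', 200], ['Bananphone', 10], ['House', 200],
        ['Bonerphone', 10], ['UniqueValue', 777]]
print(merge_similar(data))
# [['hoose', 400], ['Bananphone', 20], ['UniqueValue', 777]]
```

This does a quadratic number of distance calls in the worst case, which is fine for small lists but probably not for very large ones.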