I am trying to automate a process that is normally done by hand. There is a database full of unique keys that gets updated quarterly. This update will have some of the same exact keys and sometimes it won't but they will be similar. They will add something to the front and end or add some zeros. Here is an example of some of the already established keys and their matches:
0509000004 -> 594
0509000005 -> 595
0509000006 -> 596
0510000003 -> 5103
0311100000004000 -> 099031110000000400
0311100000004020 -> 099031110000000402
063050000100 -> 63500100
06308C000400 -> 638C0400
So as you can see there is always variation to the way the key is changed and the next time I get data it can have a few more variations. I wish I could tell the provider of the data to stop changing it but that is not up to me.
I have tried using the Levenshtein Distance algorithm like fuzzy searching in conjunction with stripping of certain numbers but it does not always get it correctly due to the large variation.
Any suggestions would be much appreciated!