I need to calculate the Levenshtein distance between two columns for each row of a table in MySQL. UDFs are available for this, but I need to do it without one: I am using MemSQL, an extremely fast in-memory database that does not support UDFs but does support nearly any query you can run in MySQL. Is anyone aware of a non-UDF implementation of the Levenshtein distance algorithm as a query? Something equivalent to the following UDF:

http://www.artfulsoftware.com/infotree/qrytip.php?id=552

I'm working on converting this myself, and I'm open to other solutions too (i.e., other ways to make this happen in MemSQL).

Note: I cannot use Hamming distance. That would be simpler, but the use case calls for Levenshtein distance.
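
For reference, here is a minimal sketch of the distance being asked about, written in Python rather than as a MemSQL query. It is the standard two-row dynamic-programming formulation, not a translation of the linked UDF, and the function name `levenshtein` is only illustrative. It may be useful for checking the output of any SQL version against known values.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via two-row dynamic programming."""
    # Keep the shorter string on the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Each cell depends on three previously computed cells, which is the part that is awkward to express in a single query without recursion or procedural code.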

user396404
  • The attached link is incorrect - can you edit and update with the correct link? – Carl Sverre Feb 03 '17 at 23:52
  • Do you mean "a query" as in "one query" or as in "a series of queries"? I don't think it is possible to do it in one query, even if you do it with a matrix (look up any implementation in C; the array can be a table with the columns `wordid, i, j`, but it will need A LOT of memory). You can of course implement the function as a series of updates: add some columns to your table (using them as variables), then update step by step. But this would mean using some kind of procedure in your client app, and then I guess it would be easier to calculate the distance in the client directly. – Solarflare Feb 04 '17 at 10:50
  • Link has been fixed. It would be in one query - therein lies the puzzle here. I'm not sure if it's possible either, but trying to figure out a solution. – user396404 Feb 04 '17 at 18:49
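
The "series of queries" idea from the comments can be sketched roughly as follows. This is an illustration under assumptions, not a MemSQL solution: it uses Python's built-in `sqlite3` as a stand-in database, a hypothetical table `cell (i, j, d)` for a single pair of strings (no `wordid` column), and a client-side loop that issues one `INSERT ... SELECT` per matrix cell. SQLite's multi-argument `MIN` plays the role that `LEAST` plays in MySQL.

```python
import sqlite3


def levenshtein_via_table(a: str, b: str) -> int:
    """Fill an (i, j, d) matrix table one cell at a time, one INSERT per cell."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cell (i INTEGER, j INTEGER, d INTEGER, PRIMARY KEY (i, j))"
    )

    # Base cases: distance between a prefix and the empty string.
    conn.executemany(
        "INSERT INTO cell (i, j, d) VALUES (?, 0, ?)",
        [(i, i) for i in range(len(a) + 1)],
    )
    conn.executemany(
        "INSERT INTO cell (i, j, d) VALUES (0, ?, ?)",
        [(j, j) for j in range(1, len(b) + 1)],
    )

    # Each new cell is derived from its upper, left, and diagonal neighbours.
    # MIN with several arguments is SQLite's scalar minimum.
    fill_cell = """
        INSERT INTO cell (i, j, d)
        SELECT :i, :j, MIN(up.d + 1, lf.d + 1, dg.d + :cost)
        FROM cell AS up, cell AS lf, cell AS dg
        WHERE up.i = :i - 1 AND up.j = :j
          AND lf.i = :i     AND lf.j = :j - 1
          AND dg.i = :i - 1 AND dg.j = :j - 1
    """
    # The client drives the iteration; the character comparison also happens here.
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            conn.execute(fill_cell, {"i": i, "j": j, "cost": cost})

    (distance,) = conn.execute(
        "SELECT d FROM cell WHERE i = ? AND j = ?", (len(a), len(b))
    ).fetchone()
    conn.close()
    return distance


if __name__ == "__main__":
    print(levenshtein_via_table("kitten", "sitting"))
```

The table needs (len(a) + 1) × (len(b) + 1) rows per pair of strings, which matches the memory concern raised above, and the loop lives in the client, so this does not satisfy the one-query requirement.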

0 Answers