SCENARIO: Given 2 input strings I need to find minimum number of insertions deletions and substitutions required to convert one string to other. The strings are text from 2 files. The comparison has to be done at word level.
What i have done is implemented edit distance algorithm which does the job well using a 2-dimensional array of size (m*n) where sizes of input strings are m and n.
The PROBLEM i am facing is if the value of m and n becomes large, say more than 16,000 i am getting OutOfMemory exception due to the large size of m*n array. Also i am running into memory fragmentation and LargeObjectHeap issues
QUESTION Looking for a C# code to solve edit distance problem for 2 very large sized strings (each containing more than 20k words) without getting OutOfMemory exception.
MapReduce or DataBase or MemoryMappedFile related solutions not feasible. Only a pure C# code will work.