
I am solving the well-known edit distance dynamic programming problem. Given two strings, string1 and string2, and the costs of deleting, inserting, and replacing a character, I have to convert string1 into string2 at minimum cost. For the DP I use a two-dimensional array. My code works for small strings (size <= 10000), but for larger inputs (size >= 100000) the compiler says "array size is too large". If the problem has to be solved with dynamic programming for input size 100000, how should I handle this error? Here is my code.

#include <cstdio>    // scanf, printf
#include <cstring>   // memset
#include <iostream>  // cin
#include <string>
using namespace std;
int DP[10000][10000],delete_cost,add_cost,replace_cost;
string first,second;
int Min(int x,int y,int z){
    int min=x<y?x:y;
    min=min<z?min:z;
    return min;
}

int Transform(int i,int j){
    // Handle the base cases before touching DP: when a string has 10000
    // characters, i or j can reach 10000 here, which is outside the array.
    if(i==(int)first.size())
        return ((int)second.size()-j)*add_cost;
    if(j==(int)second.size())
        return ((int)first.size()-i)*delete_cost;
    // Reuse the memoized result if this state was already solved.
    if(DP[i][j]!=-1)
        return DP[i][j];
    if(first.at(i)!=second.at(j)){
        int add,del,rep;
        add=Transform(i,j+1)+add_cost;
        del=Transform(i+1,j)+delete_cost;
        rep=Transform(i+1,j+1)+replace_cost;
        return DP[i][j]=Min(add,del,rep);
    }
    else
        return DP[i][j]=Transform(i+1,j+1);
}
int main(){
    int T,a,b,k,ans;
    scanf("%d",&T);

    while(T--){
        memset(DP,-1,sizeof(DP));   // mark every state as unsolved
        cin>>first;
        cin>>second;
        scanf("%d %d %d",&a,&b,&k);
        add_cost=a;
        delete_cost=b;
        replace_cost=k;
        ans=Transform(0,0);         // was commented out, which left ans uninitialized
        printf("%d\n",ans);
    }
    return 0;
}
Aadil Ahmad
  • possible duplicate of [how to differentiate two very long strings in c++?](http://stackoverflow.com/questions/26202686/how-to-differentiate-two-very-long-strings-in-c) – Paweł Stawarz Oct 07 '14 at 20:16
  • As one of the answers in the suggested duplicate (although I think this is the clearer question - maybe it should be duped with this one) points out, if you just want the _distance_, you don't need the full NxM matrix; you just need the previous row/column, so you can update the recurrence. It turns out you can reconstruct the edits in linear memory as well, but that's a little more subtle. – Jonathan Dursi Oct 07 '14 at 21:11
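
The two-row idea from the comment above could be sketched roughly as follows. This is a minimal illustration, not the asker's code: the function name `editDistanceTwoRows` and its parameters are made up for this example, and it assumes non-negative costs and that only the distance (not the edit script) is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the question): weighted edit distance that
// keeps only two rows of the DP table instead of the full matrix.
// Assumes all three costs are non-negative.
long long editDistanceTwoRows(const std::string& a, const std::string& b,
                              long long addCost, long long delCost,
                              long long repCost) {
    const std::size_t n = a.size(), m = b.size();
    // prev[j] = minimum cost to turn the first i-1 chars of a into the first j chars of b.
    std::vector<long long> prev(m + 1), cur(m + 1);
    for (std::size_t j = 0; j <= m; ++j)
        prev[j] = static_cast<long long>(j) * addCost;   // build b[0..j) from nothing
    for (std::size_t i = 1; i <= n; ++i) {
        cur[0] = static_cast<long long>(i) * delCost;     // delete all of a[0..i)
        for (std::size_t j = 1; j <= m; ++j) {
            // Diagonal step: free on a match, a replacement otherwise.
            long long diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : repCost);
            long long del  = prev[j] + delCost;           // drop a[i-1]
            long long add  = cur[j - 1] + addCost;        // insert b[j-1]
            cur[j] = std::min({diag, del, add});
        }
        std::swap(prev, cur);                             // current row becomes the previous one
    }
    return prev[m];
}
```

Memory drops from O(n * m) to O(m), so two strings of length 100000 need only a few megabytes. The running time is still proportional to n * m, so this alone fixes the "array size is too large" error but not the time needed for inputs of that size.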

1 Answer


Right, because your 100000 * 100000 array of 32-bit integers takes up 40 gigabytes of memory. You need to use a different algorithm. If you only need to compute the edit distance up to a certain maximum k, there is a modified version of the classic algorithm that only uses O(n * (2k + 1)) storage (where n is the string length) because it only uses the middle 2k + 1 diagonals of the dynamic programming matrix.
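
A rough sketch of that banded idea is shown here, under simplifying assumptions that are not in the question: unit costs for all three operations, and a hypothetical function name `boundedEditDistance`. It also keeps only two rows of the band, so storage is O(k) rather than O(n * (2k + 1)). With weighted costs the band would presumably have to be widened to about k divided by the cheaper of the insert and delete costs.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the unit-cost edit distance between a and b
// if it is at most k, and -1 otherwise. Only cells with |i - j| <= k are
// visited, because a path that leaves that band already costs more than k.
int boundedEditDistance(const std::string& a, const std::string& b, int k) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    if (std::abs(n - m) > k) return -1;      // length gap alone exceeds the budget
    const int INF = k + 1;                    // any value above k means "too expensive"
    const int width = 2 * k + 1;
    // For row i, slot d holds column j = i - k + d.
    std::vector<int> prev(width, INF), cur(width, INF);
    for (int d = k; d < width; ++d) {         // row 0: D[0][j] = j insertions
        int j = d - k;
        if (j <= m) prev[d] = j;
    }
    for (int i = 1; i <= n; ++i) {
        std::fill(cur.begin(), cur.end(), INF);
        for (int d = 0; d < width; ++d) {
            int j = i - k + d;
            if (j < 0 || j > m) continue;     // outside the real matrix
            int best = INF;
            if (j > 0) {
                // Diagonal neighbour D[i-1][j-1] sits in the same slot of prev.
                best = std::min(best, prev[d] + (a[i - 1] == b[j - 1] ? 0 : 1));
                // Left neighbour D[i][j-1] (insertion), if still inside the band.
                if (d > 0) best = std::min(best, cur[d - 1] + 1);
            }
            // Upper neighbour D[i-1][j] (deletion) sits one slot to the right in prev.
            if (d + 1 < width) best = std::min(best, prev[d + 1] + 1);
            cur[d] = std::min(best, INF);     // clamp so values never grow past k + 1
        }
        std::swap(prev, cur);
    }
    int result = prev[m - n + k];             // slot of column m in row n
    return result <= k ? result : -1;
}
```

The work is about n * (2k + 1) cell updates, so for strings of length 100000 this is practical whenever the allowed distance k is small compared with the string length.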

japreiss