I'm new to programming in C and I have two problems. I have two strings; I need to split them into words and find out whether both strings contain the same words. I will explain it with my code.
Input:
"He said he would do it."
"IT said: 'He would do it.'"
These two strings are stored in two arrays. First I need to separate the words from the other characters.
Function call:
char ** w1 = parse(s1, &len1);
The variable len1 counts the number of rows (words).
Function parse:
char ** parse(char *w, int * i)
{
    int j = 0, y, dupl = 0; //variables for cycles and duplicate flag
    char d[] = " <>[]{}()/\"\\+-*=:;~!@#$%^&_`'?,.|";
    char* token = strtok(w, d);
    unsigned len = strlen(token), x;
    unsigned plen = len;
    char ** f = (char**) malloc(len * sizeof (char*));
    while (token != NULL)
    {
        len = strlen(token);
        for (x = 0; x < len; x++)
        {
            token[x] = tolower(token[x]);
        }
        for (y = 0; y < *i; y++) //cycle for deleting duplicated words
        {
            if (equals(token, f[y]) == 1)
            {
                dupl = 1;
                break;
            }
        }
        if (dupl == 1)
        {
            token = strtok(NULL, d);
            dupl = 0;
            continue;
        }
        if (len >= plen)
        {
            f = (char**) realloc(f, (len+1) * sizeof (char*));
            plen = len;
        }
        else
            f = (char**) realloc(f, (plen+1) * sizeof (char*));
        f[j] = token;
        token = strtok(NULL, d);
        *i = *i + 1;
        j++;
    }
    free(token);
    return f;
}
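One possible reading of the crash, offered as a sketch rather than a confirmed diagnosis: the realloc calls size f by the length of the current word (len or plen), while f actually has to hold one pointer per stored word. With a long input the number of unique words can exceed the longest word length plus one, so f[j] would be written past the end of the allocation. The version below sizes the array by word count instead. The names parse_words, capacity and grown are hypothetical and not part of the code above; it also resets the counter itself and handles an input with no words, where strlen(token) in the original would receive NULL.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of parse: splits text in place and returns an array
   of pointers to the unique, lowercased words. The array is sized by the
   number of words stored, not by the length of any word. */
char **parse_words(char *text, int *count)
{
    const char *delims = " <>[]{}()/\"\\+-*=:;~!@#$%^&_`'?,.|";
    size_t capacity = 8;              /* pointer slots currently allocated */
    char **words = malloc(capacity * sizeof *words);
    char *token;
    int k;

    assert(text != NULL && count != NULL);
    *count = 0;                       /* do not rely on the caller to zero it */
    if (words == NULL)
        return NULL;

    for (token = strtok(text, delims); token != NULL;
         token = strtok(NULL, delims)) {
        size_t x;
        for (x = 0; token[x] != '\0'; x++)
            token[x] = (char) tolower((unsigned char) token[x]);

        /* skip words that are already stored */
        for (k = 0; k < *count; k++)
            if (strcmp(token, words[k]) == 0)
                break;
        if (k < *count)
            continue;

        /* grow by doubling once every slot is in use */
        if ((size_t) *count == capacity) {
            char **grown = realloc(words, 2 * capacity * sizeof *words);
            if (grown == NULL) {
                free(words);
                return NULL;
            }
            words = grown;
            capacity *= 2;
        }
        words[(*count)++] = token;    /* points into text, not a copy */
    }
    return words;
}
```

Assigning the realloc result to a temporary first keeps the old block reachable if the call fails. The stored pointers still point into the caller's string, so that string must stay alive for as long as the array is used.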
OK, now I have two 2D arrays. I just sort them (qsort(w1, len1, sizeof (char*), cmp);) and compare them:
for (i = 0; i < len2; i++)
    if (equals(w1[i], w2[i]) == 0)
        return 0;
Function equals:
int equals(char *w1, char *w2)
{
    if (strcmp(w1, w2) == 0)
        return 1;
    return 0;
}
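For reference, the sort-and-compare step could be sketched as follows. The cmp function is not shown above, so cmp_words here is an assumption about what it might look like, and same_words is a hypothetical wrapper. It also checks that both arrays have the same number of words before walking them, since a loop bounded only by len2 would read past the end of w1 when len1 is smaller.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed comparator: the elements are char*, so qsort hands over
   pointers to them (effectively char**). */
static int cmp_words(const void *a, const void *b)
{
    const char *wa = *(const char *const *) a;
    const char *wb = *(const char *const *) b;
    return strcmp(wa, wb);
}

/* Hypothetical wrapper: returns 1 if both arrays of unique words contain
   the same words, 0 otherwise. Sorts both arrays in place. */
int same_words(char **w1, int len1, char **w2, int len2)
{
    int i;

    if (len1 != len2)
        return 0;                 /* different word counts cannot match */

    qsort(w1, (size_t) len1, sizeof *w1, cmp_words);
    qsort(w2, (size_t) len2, sizeof *w2, cmp_words);

    for (i = 0; i < len1; i++)
        if (strcmp(w1[i], w2[i]) != 0)
            return 0;
    return 1;
}
```

This relies on duplicates having been removed from both arrays beforehand, as parse does.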
I know that all of this can be faster, but first I need to solve my problem. This works for the input I wrote at the beginning, but when I type something long, e.g. 500 characters, my result is Aborted. I think that the problem is here:
f = (char**) realloc(f, (len+1) * sizeof (char*));
but I don't know why. The second thing is that I can't free my arrays. This
void Clear (char ** w, int max)
{
    int i;
    for (i = 0; i < max; i++)
        free(w[i]);
    free(w);
}
gives me a segmentation fault. Thanks for your time, and I'm sorry for my bad English and bad programming skills.
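A note on the segmentation fault in Clear, as a likely explanation rather than a verified one: strtok returns pointers into the string passed to it, so every f[j] = token stores an address inside the original input buffer, not a block obtained from malloc. Calling free on those addresses is undefined behaviour, and only free(w) on the array itself would be valid. If each word should own its memory, one option is to copy the token before storing it. In the sketch below, copy_word and clear_owned are hypothetical names, not functions from the code above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of word,
   or NULL if the allocation fails. */
char *copy_word(const char *word)
{
    char *copy = malloc(strlen(word) + 1);   /* +1 for the terminator */
    if (copy != NULL)
        strcpy(copy, word);
    return copy;
}

/* Valid only when every w[i] came from malloc (e.g. via copy_word). */
void clear_owned(char **w, int max)
{
    int i;
    for (i = 0; i < max; i++)
        free(w[i]);
    free(w);
}
```

With this approach the assignment in parse would become f[j] = copy_word(token). Without the copies, the cleanup reduces to a single free(w).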