Your problem is that you only call addSentence() at EOF, so it never sees anything until you have read the whole file. Presumably, you need to call it when you detect the end of a sentence (the test for '.', '?' or '!') as well as at EOF. You'll also need to null terminate the string before calling addSentence, and then reset the buffer with a new allocation of the correct size. It's not clear why you have two loops; you could miss some newlines as end of sentence. Rework with just one loop.
It's not entirely clear if newlines mark the ends of sentences. This revision assumes that they do:
    int maxSize = 256;
    int currSize = maxSize;
    int i = 0;
    int c;
    char *sentence = (char*)malloc(maxSize);
    assert(sentence != 0);  // Not a production-ready error check
    while ((c = fgetc(input)) != EOF)
    {
        if (i + 3 > currSize)   // Need room for c, an added '\n' and the '\0'
        {
            currSize += maxSize;
            sentence = (char*)realloc(sentence, currSize);
            assert(sentence != 0);  // Not a production-ready error check
        }
        sentence[i++] = c;
        if ((c == '\n') || (c == '.') || (c == '?') || (c == '!'))
        {
            if (c != '\n')
                sentence[i++] = '\n';
            sentence[i] = '\0';
            addSentence(sentence);
            sentence = (char*)malloc(maxSize);  // addSentence() keeps the old buffer
            assert(sentence != 0);  // Not a production-ready error check
            currSize = maxSize;
            i = 0;
        }
    }
    sentence[i] = '\0';
    addSentence(sentence);

Note that the error checking for failed memory allocation is not production quality: assert() compiles to nothing when NDEBUG is defined, so there should be some proper, unconditional error checking. The space check runs before each character is stored and reserves room for the character, a possible added newline and the terminating null, so end-of-sentence punctuation falling at the very end of the buffer cannot cause an overflow. In production code, I'd use a string data type and a function to do the adding. I'd probably also take a guess that most sentences are shorter than 256 characters (especially if newlines mark the end), and would use a maxSize of 64, which would leave less allocated memory unused.
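As a rough illustration of the "string data type and a function to do the adding" idea, here is a minimal sketch. The names StrBuf, sb_append, sb_release and read_sentences are hypothetical, not from your code. It also assumes that addSentence() takes a char * whose ownership it keeps and returns nothing, which your question does not confirm, so the callback type may need adjusting. Allocation failures are reported through return values instead of assert().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable string; all names here are illustrative only. */
typedef struct
{
    char  *data;   /* Null terminated after any successful sb_append() */
    size_t len;    /* Characters stored, excluding the terminator */
    size_t cap;    /* Bytes allocated */
} StrBuf;

/* Append one character, growing the buffer as needed.
   Returns 0 on success, -1 on allocation failure (contents left intact). */
int sb_append(StrBuf *sb, char c)
{
    if (sb->len + 2 > sb->cap)      /* Room for c plus the '\0' */
    {
        size_t new_cap = (sb->cap == 0) ? 64 : sb->cap * 2;
        char *new_data = realloc(sb->data, new_cap);
        if (new_data == NULL)
            return -1;
        sb->data = new_data;
        sb->cap = new_cap;
    }
    sb->data[sb->len++] = c;
    sb->data[sb->len] = '\0';
    return 0;
}

/* Hand the string to the caller and reset sb to empty.
   The caller now owns the result (NULL if nothing was appended). */
char *sb_release(StrBuf *sb)
{
    char *result = sb->data;
    sb->data = NULL;
    sb->len = 0;
    sb->cap = 0;
    return result;
}

/* Read sentences from input and pass each one to add().
   ASSUMPTION: add() (e.g. addSentence) takes ownership of the string.
   Returns 0 on success, -1 on allocation failure. */
int read_sentences(FILE *input, void (*add)(char *))
{
    StrBuf sb = { NULL, 0, 0 };
    int c;

    while ((c = fgetc(input)) != EOF)
    {
        if (sb_append(&sb, (char)c) != 0)
            goto fail;
        if (c == '\n' || c == '.' || c == '?' || c == '!')
        {
            if (c != '\n' && sb_append(&sb, '\n') != 0)
                goto fail;
            add(sb_release(&sb));
        }
    }
    if (sb.len > 0)                 /* Unterminated text before EOF */
        add(sb_release(&sb));
    return 0;

fail:
    free(sb.data);
    return -1;
}
```

Because sb_append() always leaves room for the terminator, the reading loop no longer has to track sizes at all. One behavioural difference from the loop above: this sketch skips the final call when nothing follows the last sentence, instead of passing an empty string.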