I have been stuck on a problem for the last three days. I have a file containing sentences.

I am reading the file with:

int maxSize = 256;
int currSize = 0;
int i = 0;
char *sentence = (char*)malloc(maxSize);
char c;

currSize = maxSize;

while ((c = fgetc(input)) != EOF)
{
    sentence[i++] = c;

    while((c = fgetc(input)) != '\n')
    {
        sentence[i++] = c;

        if((c == '.') || (c == '?') || (c == '!'))
            sentence[i++] = '\n';

        if(i == currSize)
        {
            currSize = i + maxSize;
            sentence = (char*)realloc(sentence,currSize);
        }
    }
}

sentence[i] = '\0';

addSentence(sentence);

When the function addSentence adds sentences to a linked list, there is a problem: it only adds one sentence, made up of everything in the file.

I'm a beginner in C. Thank you.

  • So basically you are saying is that the problem lies in `addSentence`, and is not related to the function you are showing? – Jongware Apr 15 '16 at 23:48
  • Use `int c;` instead of `char c;` because `fgetc()` returns an `int`, which can hold any value that a `char` can hold plus one extra value, EOF. – Jonathan Leffler Apr 15 '16 at 23:51
  • It's related to the function. It's about \n but I don't know how to fix it :/ – Jacob Pascal Apr 15 '16 at 23:52
  • Your problem is that you only call `addSentence()` at the EOF, so it doesn't magically get to see anything before you have read the whole file. Presumably, you need to call it when you detect the end of a sentence (with the test for `'.'`, `'?'` or `'!'` — you'll also need to null terminate the string before calling `addSentence` and reset the memory with a new allocation and the correct size afterwards) as well as at EOF. It's not clear why you have two loops; you could miss some newlines as end of sentence. Rework with just one loop. – Jonathan Leffler Apr 15 '16 at 23:54
  • I had one while loop, but it was reading only one line. I wanted to read all the lines, and that's how I ended up with this function. – Jacob Pascal Apr 15 '16 at 23:58
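
The `int c` point from the comments can be shown with a small sketch. `count_bytes` is a hypothetical helper, not part of the question's code, and it assumes `input` is an already opened stream.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in a stream.
 * c is an int, so the byte 0xFF (which fgetc returns as 255) stays
 * distinct from EOF (a negative value). With `char c`, 0xFF could
 * compare equal to EOF and end the loop early, or, where char is
 * unsigned, EOF could never match and the loop would not end. */
size_t count_bytes(FILE *input)
{
    int c;
    size_t n = 0;

    while ((c = fgetc(input)) != EOF)
        n++;

    return n;
}
```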

1 Answer

Your problem is that you only call addSentence() at the EOF, so it doesn't magically get to see anything before you have read the whole file. Presumably, you need to call it when you detect the end of a sentence (with the test for '.', '?' or '!' — you'll also need to null terminate the string before calling addSentence and reset the memory with a new allocation and the correct size) as well as at EOF. It's not clear why you have two loops; you could miss some newlines as end of sentence. Rework with just one loop.

It's not entirely clear if newlines mark the ends of sentences. This revision assumes that they do:

int maxSize = 256;
int currSize = maxSize;
int i = 0;
int c;
char *sentence = (char*)malloc(maxSize);
assert(sentence != 0);  // Not a production-ready error check

while ((c = fgetc(input)) != EOF)
{
    sentence[i++] = c;

    if ((c == '\n') || (c == '.') || (c == '?') || (c == '!'))
    {
        if (c != '\n')
            sentence[i++] = '\n';
        sentence[i] = '\0';
        addSentence(sentence);
        sentence = malloc(maxSize);
        assert(sentence != 0);  // Not a production-ready error check
        currSize = maxSize;
        i = 0;
    }

    if (i == currSize)
    {
        currSize = i + maxSize;
        sentence = (char*)realloc(sentence, currSize);
        assert(sentence != 0);  // Not a production-ready error check
    }
}

sentence[i] = '\0';
addSentence(sentence);

Note that the error checking for failed memory allocation is not production quality; there should be some proper, unconditional error checking. There is a small risk of buffer overflow if the end of sentence punctuation falls in exactly the wrong place. Production code should avoid that, too, but it would be fiddlier. I'd use a string data type and a function to do the adding. I'd probably also take a guess that most sentences are shorter than 256 characters (especially if newlines mark the end), and would use maxSize of 64. It would lead to less unused memory being allocated.
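
One way to close the overflow gap mentioned above is to reserve space before every write instead of checking afterwards. The following is only a sketch under assumptions: `readSentences`, `reserve`, `CHUNK`, and the array-based `addSentence` stand-in are hypothetical names for illustration, and the real `addSentence` is assumed to take ownership of the string it receives. Unlike the revision above, it skips the final call when nothing has been read since the last sentence ended.

```c
#include <stdio.h>
#include <stdlib.h>

enum { CHUNK = 64, MAX_STORED = 16 };

/* Stand-in for the question's addSentence(): it only records the
 * pointers so the sketch is runnable. The real one adds to a list. */
char *stored[MAX_STORED];
size_t storedCount = 0;

void addSentence(char *sentence)
{
    if (storedCount < MAX_STORED)
        stored[storedCount++] = sentence;
    else
        free(sentence);
}

/* Grow *buf so it can hold at least `needed` bytes; returns 0 on failure. */
static int reserve(char **buf, size_t *cap, size_t needed)
{
    if (needed <= *cap)
        return 1;

    size_t newCap = *cap;
    while (newCap < needed)
        newCap += CHUNK;

    char *tmp = realloc(*buf, newCap);  /* realloc(NULL, n) acts like malloc(n) */
    if (tmp == NULL)
        return 0;

    *buf = tmp;
    *cap = newCap;
    return 1;
}

/* Returns 0 on success, -1 if memory allocation fails. */
int readSentences(FILE *input)
{
    char *sentence = NULL;
    size_t cap = 0;
    size_t len = 0;
    int c;

    while ((c = fgetc(input)) != EOF)
    {
        /* Room for c, a possible added '\n', and the terminating '\0'. */
        if (!reserve(&sentence, &cap, len + 3))
        {
            free(sentence);
            return -1;
        }

        sentence[len++] = (char)c;

        if (c == '\n' || c == '.' || c == '?' || c == '!')
        {
            if (c != '\n')
                sentence[len++] = '\n';
            sentence[len] = '\0';
            addSentence(sentence);  /* ownership passes to addSentence */
            sentence = NULL;        /* the next write allocates a fresh buffer */
            cap = 0;
            len = 0;
        }
    }

    if (len > 0)
    {
        sentence[len] = '\0';       /* space was reserved on the last write */
        addSentence(sentence);
    }

    return 0;
}
```

Because the buffer always has room for three more bytes before a character is stored, the punctuation, the added newline, and the terminator cannot run past the end, wherever the punctuation falls.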

Jonathan Leffler
  • It's adding empty sentences into linked list :/ – Jacob Pascal Apr 16 '16 at 00:17
  • OK; so add a check on the length (and/or content) of the sentence before adding it. That would happen if you have a full stop (period) at the end of a line, for example. You might decide that a sentence full of spaces isn't interesting. You just have to think your way through what is happening. And decide what the desired behaviour is. If it were my program, I'd probably replace newlines with blanks, and only add sentences on encountering suitable punctuation. I'd probably also worry about "quoted phrases." where the full stop is immediately followed by a double quote (or a single quote). – Jonathan Leffler Apr 16 '16 at 00:21
  • (Parenthetical sentences would be problematic too.) Not to mention … pauses in the middle of a sentence. That's a cheat; it's a Unicode 'ellipsis'; you could also have ... (written out the old-fashioned way). What about Mr. Pascal? Is that one or two sentences? That one's really tricky. One of the nice things about programming is that you get to use your brain. – Jonathan Leffler Apr 16 '16 at 00:23
  • OK, I fixed it. Now I will work on these tricky things, as you said :) Thank you so much – Jacob Pascal Apr 16 '16 at 00:33
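
The length/content check suggested in the comments above could look roughly like this. `isBlankSentence` is a hypothetical helper, and treating whitespace-only strings as uninteresting is just one possible choice of desired behaviour.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if s is empty or holds only
 * whitespace (spaces, tabs, newlines), and 0 otherwise. */
int isBlankSentence(const char *s)
{
    while (*s != '\0')
    {
        /* The cast avoids undefined behaviour for negative char values. */
        if (!isspace((unsigned char)*s))
            return 0;
        s++;
    }
    return 1;
}
```

A caller might then call `addSentence(sentence)` only when `isBlankSentence(sentence)` returns 0, and free the buffer otherwise, assuming `addSentence` takes ownership of the strings it is given.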