I need to write a C program that removes empty lines from a file as homework. Since we haven't studied a way to remove characters from a file, my first attempt was to overwrite the characters in place, but both fprintf and fputc insert characters:

#include <stdio.h>

#define MAX_SIZE 1000

int main() {
    FILE *fp = fopen("sortie.txt", "r+");
    int off = 0;
    for (char c1 = '\n', c2;;) {
        if (((c2 = fgetc(fp)) == '\n') && (c1 == '\n')) {
            off++;
            continue;
        }
        if (c2 == EOF) {
            fseek(fp, -off ,SEEK_CUR);
            fputc(EOF, fp);
            break;
        }
        //if(!off)continue;

        fseek(fp, -off, SEEK_CUR);
        fprintf(fp, "%c", c1 = c2);

        fseek(fp, off, SEEK_CUR);
    }
    fclose(fp);

    return 0;
}

My second attempt was to replace them with '\0':

#include <stdio.h>

int main() {
    FILE *fp = fopen("sortie.txt", "r+");
    for (char c1 = '\n', c2;;) {
        if (((c2 = fgetc(fp)) == '\n') && (c1 == '\n')) {
            fseek(fp, -1, SEEK_CUR);
            fputc('\0', fp);
            fseek(fp, 1, SEEK_CUR);
        }
    }
    fclose(fp);

    return 0;
}

Neither attempt (overwriting the characters, or replacing them with 0) worked.

chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • The way forward is to create a new file. Although it is possible to overwrite single characters, this isn't the way to remove an empty line. A `0` has no place in a text file. Another good reason to create another file is that if things go wrong partway, you are likely to trash the file you are attempting to modify. – Weather Vane Feb 25 '23 at 17:08
  • how do i replace single characters @WeatherVane – YellowFlash Feb 25 '23 at 17:47
  • As weather vane says, do *not* edit the file. As a very good general rule, treat files as immutable. Life is better in so many ways if you do that. Files get written once, and then they never change. Instead, create a new file and rename it when you are done. It is much simpler to code and results in a more robust process. – William Pursell Feb 25 '23 at 17:47
  • i didnt want to create a new file because we didnt study it in class and this is for homework so not very serious but thanks for pointing it out – YellowFlash Feb 25 '23 at 17:49
  • Please note that [`fgetc`](https://en.cppreference.com/w/c/io/fgetc) returns an **`int`**. Which is rather important for that comparison against the `int` value `EOF`. – Some programmer dude Feb 25 '23 at 17:51
  • @YellowFlash, your code *is* overwriting a single character. But you are not removing blank lines, just corrupting the file. – Weather Vane Feb 25 '23 at 17:54
  • `fputc(EOF, fp);` does not make any sense. `EOF` is not a character in the file. EOF is the value returned by `fgetc` to indicate that the end of file was reached or an error occurred. It was not read from the file and should not be written to it. – William Pursell Feb 25 '23 at 17:54
  • @WeatherVane the resulting file had more characters – YellowFlash Feb 25 '23 at 17:55
  • @WilliamPursell oh i thought like in strings you terminate it with 0 – YellowFlash Feb 25 '23 at 17:56
  • Text files do not contain C strings. – Weather Vane Feb 25 '23 at 18:01
  • Even if you don't use two files, you can use two different FILE *s in the program. Open one with mode `"r"`, and open another with mode `"r+"`. Read from one and write to the other. After you close them both, `truncate`. This will be much simpler than fseeking. (But the file is easily corruptible) – William Pursell Feb 25 '23 at 18:34
  • @WilliamPursell actually my first attempt used two instances of the file one with "r" mode and the other with "w" mode but didnt work also dont know how to truncate – YellowFlash Feb 25 '23 at 19:06
  • Opening with mode `"w"` will truncate the file to size 0 (discard all data), so you should use `"r+"` for the writer. Use `truncate(path, length)` to fix the length. (If you delete N characters, the desired length will be the original length minus N) – William Pursell Feb 25 '23 at 19:14
  • @WilliamPursell can i use '"w"' and add only the characters that i want? like read from the instance with '"r"' mode and evaluate the characters and then add them to the other instance – YellowFlash Feb 25 '23 at 19:32
  • After `fopen(path, "w");`, there is no data in the file. If you want to write to the file and not throw away all the data, you have to open with `"r+"`. If you discard a character from the file, all of the characters after it need to be moved, so you cannot "only add the characters that I want". You have to re-write everything after the first deletion. You *can* avoid overwriting if you are writing the character that is already there, but the logic to keep track of that is (probably) not worthwhile. – William Pursell Feb 25 '23 at 19:58
  • @WilliamPursell ok thank you so much one last thing i cant find how to truncate file – YellowFlash Feb 25 '23 at 22:53
  • @YellowFlash It is platform dependent: https://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html – William Pursell Feb 26 '23 at 13:57
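
The two-stream idea from the comments above can be sketched roughly as follows. This is only an illustration, assuming a POSIX system (`truncate` is not standard C); the function name `remove_empty_lines_in_place` is made up for this example. Note that `c` is an `int`, as pointed out above, so that `EOF` can be told apart from real data. As the comments warn, an interruption partway through can leave the file corrupted.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>   /* truncate(): POSIX, not standard C */

/* Hypothetical helper: removes empty lines from the file at `path` in place.
   Returns 0 on success, -1 on failure. */
int remove_empty_lines_in_place(const char *path) {
    FILE *in = fopen(path, "r");      /* reader */
    if (in == NULL)
        return -1;
    FILE *out = fopen(path, "r+");    /* writer: "r+" keeps the existing data */
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    long kept = 0;                    /* number of bytes kept so far */
    int c, last = '\n';               /* int, so EOF is distinct from any char */
    while ((c = getc(in)) != EOF) {
        /* skip a newline that directly follows another newline
           (or the start of the file): that is an empty line */
        if (c != '\n' || last != '\n') {
            putc(c, out);             /* the writer never gets ahead of the reader */
            kept++;
        }
        last = c;
    }
    fclose(in);
    if (fclose(out) != 0)
        return -1;
    /* cut off the stale bytes left over at the end of the file */
    return truncate(path, (off_t)kept);
}
```

Calling `remove_empty_lines_in_place("sortie.txt")` would then shrink the file by the number of removed newlines, which matches the "original length minus N" rule mentioned above.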

1 Answer

In order to remove characters from a file, the length of the file must be reduced. There is no portable way to truncate a file, except to a size of 0.

A portable way to achieve your goal is to read the file contents into memory, perform whatever processing is needed in memory, and write the new contents back into a newly created file with the same name.

Here is an example with error checking:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *fp = fopen("sortie.txt", "r");
    if (fp == NULL) {
        perror("cannot open sortie.txt for reading");
        return 1;
    }
    char *buffer = NULL;
    size_t length = 0, size = 0;
    int c, last = '\n';
    while ((c = getc(fp)) != EOF) {
        if (c != '\n' || last != '\n') {
            if (length >= size) {
                size_t new_size = size + size / 2 + 32;
                char *new_buffer = realloc(buffer, new_size);
                if (new_buffer == NULL) {
                    perror("cannot reallocate buffer");
                    free(buffer);
                    fclose(fp);
                    return 1;
                }
                buffer = new_buffer;
                size = new_size;
            }
            buffer[length++] = (char)c;
        }
        last = c;
    }
    fclose(fp);
    fp = fopen("sortie.txt", "w");
    if (fp == NULL) {
        perror("cannot open sortie.txt for writing");
        free(buffer);
        return 1;
    }
    int status = 0;
    size_t written = fwrite(buffer, 1, length, fp);
    free(buffer);
    if (written != length) {
        fprintf(stderr, "error writing to sortie.txt: %zu written, expected %zu\n",
                written, length);
        status = 1;
    }
    if (fclose(fp)) {
        perror("error closing sortie.txt");
        status = 1;
    }
    return status;
}

The problem with this approach is the potential for data loss if the program is interrupted before the updated contents are fully written.

For an alternative, more classic approach that can handle files too large to fit in memory, you can write the modified contents to a new file. Once this is complete, you can remove the original file and rename the new file to the original name.

Here is an example of this latter approach:

#include <stdio.h>

int main(void) {
    FILE *fp = fopen("sortie.txt", "r");
    if (fp == NULL) {
        perror("cannot open sortie.txt");
        return 1;
    }
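    FILE *fp2 = fopen("sortie-new.txt", "w");
    if (fp2 == NULL) {
        perror("cannot open sortie-new.txt");
        fclose(fp);
        return 1;
    }
    /* copy every character except a newline that directly follows another
       newline (or the start of the file), which is what makes a line empty */
    int c, last = '\n';
    while ((c = getc(fp)) != EOF) {
        if (c != '\n' || last != '\n')
            putc(c, fp2);
        last = c;
    }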
    fclose(fp);
    if (fclose(fp2)) {
        perror("error writing to sortie-new.txt");
        remove("sortie-new.txt");
        return 1;
    }
    if (remove("sortie.txt")) {
        perror("cannot remove sortie.txt");
        return 1;
    }
    if (rename("sortie-new.txt", "sortie.txt")) {
        perror("cannot rename sortie-new.txt");
        return 1;
    }
    return 0;
}
chqrlie
  • actually saw this idea when searching google but i preferred not to go for it because it needs double the space plus we didnt see deleting and renaming files in class but thanks anyway – YellowFlash Feb 25 '23 at 17:43
  • In actual practice, this will almost certainly not "use double the space". Even if you use the `fopen/fwrite/fclose` family, chances are good the files will never hit the physical disk and will just stay in the virtual file system the entire time. (If the files are big enough that this is actually a concern, you have a more difficult problem, and the solution to that problem is probably going to involve using disk to keep two copies of the files instead of trying to keep it all in memory, so you are back to this approach anyway.) – William Pursell Feb 25 '23 at 17:50
  • @WilliamPursell any way we didnt see in class the way to delete and rename files so i thought i shouldnt use them but i might use it as a last resort and hope the teacher doesnt read the code – YellowFlash Feb 25 '23 at 17:58
  • @YellowFlash: if you have not seen file deletion and such, then read the file into memory, which may require realloc if there is no maximum size, and write back to the same file opened with "w". If the teacher insists on using "r+", you should be careful and consider attending a different class. – chqrlie Feb 25 '23 at 18:10
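
If the file is known to be small, the suggestion in the last comment can be sketched without `realloc`. The version below is only an illustration: it assumes the file never exceeds `MAX_SIZE` bytes (the constant defined but unused in the question), and the function name `remove_empty_lines_small` is made up for this example. It refuses to touch the file if the contents do not fit.

```c
#include <stdio.h>

#define MAX_SIZE 1000   /* assumed upper bound on the file size, as in the question */

/* Hypothetical helper: removes empty lines from a small file.
   Returns 0 on success, -1 on failure. */
int remove_empty_lines_small(const char *path) {
    char buffer[MAX_SIZE];
    size_t length = 0;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int c, last = '\n';               /* int, so EOF is distinct from any char */
    while ((c = getc(fp)) != EOF) {
        if (c != '\n' || last != '\n') {
            if (length == MAX_SIZE) { /* too large: give up before modifying the file */
                fclose(fp);
                return -1;
            }
            buffer[length++] = (char)c;
        }
        last = c;
    }
    fclose(fp);
    fp = fopen(path, "w");            /* "w" truncates the file to size 0 */
    if (fp == NULL)
        return -1;
    size_t written = fwrite(buffer, 1, length, fp);
    /* fclose is evaluated first, so the stream is always closed */
    if (fclose(fp) != 0 || written != length)
        return -1;
    return 0;
}
```

As with the first example in the answer, the contents can be lost if the program is interrupted between reopening the file with "w" and finishing the write.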