
I have two questions related to the titular exercise, from The C Programming Language. I'm sure that they've both been answered before, so either a direct answer or a link to a previous post (I couldn't find any) would be appreciated.

The exercise itself is to write a C program that removes comments from C code.

  1. I've seen lots of examples of this program, but I can't figure out how to test it. They all use getchar() to "acquire" the code that they're going to edit, but I can't figure out how to tell the program to read another file, rather than to just wait for input from the command line. I tried "./a.out program_to_edit.c", but that didn't work. Alternatively, if there's an easy way to create a string out of blocks of text (rather than one character at a time) like in other languages, that would work too.

  2. This question is a bit more general. I'm confused about how escape characters work when reading C source code with getchar(). If I'm viewing a .c file in TextEdit, I see "\t", but if I compiled it and printed that out, it would come out as a tab character. Does that mean that the .c file contains '\\' and 't' and the compiler combines them, or is it something else entirely? What will getchar() return if I use it to read through that file?

Thanks.

2 Answers

4

For the first part of your question, you can read a file with something like this. You should find examples of this in "The Book."

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("Some_file.txt", "r");
    if (fp != NULL)
    {
        int c;

        /* Read one character per pass; stop at end of file */
        while ((c = fgetc(fp)) != EOF)
        {
            /* Do something with c */
        }

        fclose(fp);
    }
    else
    {
        printf("Can't open the file?\n");
    }

    return 0;
}

For the second part of your question, a backslash inside a character or string literal tells the compiler that the backslash and the character(s) following it get replaced by a single character.

  • \t gets replaced by (char) 9, which is the ASCII tab character.
  • \a gets replaced by (char) 7, which is an audible bell.
  • \n gets replaced by (char) 10, the newline character. On DOS/Windows systems, text-mode output then writes it to the file as (char) 13, (char) 10.

Do some rereading. It's in "The Book."

Welcome to Stack Overflow.

EvilTeach
  • Thanks for the comprehensive answer. Oddly enough, the book doesn't cover file inputs before this exercise is assigned. I didn't think to read ahead, but good idea. As for part two, I wasn't clear enough. I know what escape characters are, I just don't know whether getchar() reading a "\t" out of a .c file would return an escaped backslash and then return a "t", or if it would return an actual tab character. – Conner Hansen Jun 21 '15 at 05:06
  • ah. From a file it would be two characters. Read the backslash then the t. – EvilTeach Jun 21 '15 at 13:40
0
  1. Redirect standard input: `./a.out < program_to_edit.c`
  2. Escaped representations exist only in source code. The compiler converts them to the appropriate characters.
dlask
  • Thanks. To clarify question two, what I meant to say was that I'm confused by the fact that you can have actual newlines and tabs and such in a .c file, but you can also have two-character escape sequences that will be converted into them upon compilation. If you're reading through a .c file with getchar(), an actual newline will obviously return a newline, but will an escaped backslash followed by an 'n' do the same? – Conner Hansen Jun 21 '15 at 05:15
  • The sequences like '\' and 'n' exist only to allow you to write special characters in a comfortable way. The two possible representations `'\n'` and `10` are treated equally by the compiler. – dlask Jun 21 '15 at 05:25
  • Nevermind, answered my own question. getchar() reads them as '\\' followed by 'n'. – Conner Hansen Jun 21 '15 at 05:25
  • Definitely not. `getchar()` returns the value 10. See my previous comment. – dlask Jun 21 '15 at 05:27
  • I just tested it, and I'm pretty sure it does. What I'm saying is that if you pass a file that contains the text "printf("hi\n");" to a program that prints out the file character by character using getchar(), it will print out the "\n", rather than replacing the two characters with the return character that printf would generate if you called the method itself. – Conner Hansen Jun 21 '15 at 05:32
  • We are mixing two different things. When your input contains backslash then `getchar` returns backslash. When your input contains `'n'` then `getchar` returns `'n'`. On the other hand, when your *source code* contains `'\n'` the compiler treats it as a character whose code is 10. – dlask Jun 21 '15 at 06:16
  • Yeah, sorry for the confusion. Anyways, thanks for your help, everything is all cleared up now. – Conner Hansen Jun 21 '15 at 06:48