
I'm currently reading this edition of Compilers: Principles, Techniques, and Tools.

I've never coded in C before (although I've dabbled in C++), but from other programming knowledge most of the code makes sense. However, I noticed a quirk: the functions are defined like this:

emit(t, tval)
    int t, tval;
{
}

Figuring something was amiss, I looked it up and, sure enough, that way of defining functions seems to be obsolete. I was hoping someone could read the code I retyped below and warn me of any bad practices or obsolete techniques that I might otherwise pick up without noticing. Any heads-up on newer C features that would help me write neater code would also be greatly appreciated. I am primarily looking for code that relies on semantics, features, functions, etc. that are no longer in the C standard, rather than feedback on style.

Also, if you have a copy of the book, and wouldn't mind flicking through it and seeing (or even from memory) if you can spot some other obsolete or redundant methods of doing things, that'd be fantastic too!

I wasn't sure whether this would fit better in CodeReview, if so, please comment and I'll delete and repost there, but I think since it's from a popular programming text, it might be more fitting here. Apologies if I'm incorrect.

global.h

#include <stdio.h>
#include <ctype.h>

#define BSIZE   128
#define NONE    -1
#define EOS     '\0'

#define NUM     256
#define DIV     257
#define MOD     258
#define ID      259
#define DONE    260

int tokenval;
int lineno;

struct entry {
    char *lexptr;
    int token;
};

struct entry symtable[];

lexer.c

#include "global.h"

char lexbuf[BSIZE];
int lineno = 1;
int tokenval = NONE;

int lexan()
{
    int t;

    while (1) {
        t = getchar();

        if (t == ' ' || t == '\t')
            ;
        else if (t == '\n')
            lineno++;
        else if (isdigit(t)) {
            ungetc(t, stdin);
            scanf("%d", &tokenval);
            return NUM;
        }
        else if (isalpha(t)) {
            int p, b = 0;
            while (isalnum(t)) {
                lexbuf[b] = t;
                t = getchar();
                b++;
                if (b >= BSIZE)
                    error("compiler error");
            }
            lexbuf[b] = EOS;
            if (t != EOF)
                ungetc(t, stdin);
            p = lookup(lexbuf);
            if (p == 0)
                p = insert(lexbuf, ID);
            tokenval = p;
            return symtable[p].token
        }
        else if (t == EOF)
            return DONE;
    }
}

parser.c

#include "global.h"

int lookahead;

parse()
{
    lookahead = lexan();
    while (lookahead != DONE) {
        expr(); match(';');
    }
}

expr()
{
    int t;
    term();
    while (1)
        switch (lookahead) {
            case '+': case '-':
                t = lookahead;
                match(lookahead); term(); emit(t, NONE);
                continue;
            default:
                return;
        }
}

term()
{
    int t;
    factor();
    while (1)
        switch (lookahead) {
            case '*': case '/':
                t = lookahead;
                match(lookahead); factor(); emit(t, NONE);
                continue;
            default:
                return;
        }
}

factor()
{
    switch (lookahead) {
        case '(':
            match('('); expr(); match(')'); break;
        case NUM:
            emit(NUM, tokenval); match(NUM); break;
        case ID:
            emit(ID, tokenval); match(ID); break;
        default:
            error("Syntax error");
    }
}

match (t)
    int t;
{
    if (lookahead == t)
        lookahead = lexan();
    else error("Syntax error");
}
Ashley Davies

2 Answers


From just skimming the code, other than the old-style parameter declarations you've already brought up, the only other outdated feature I see is the missing return type on functions that don't return anything. For example, the parse function returns nothing, which this old code expresses by declaring it as just parse(); the compiler then assumes a return type of int. That implicit int was removed in C99, so modern code should declare it as void parse(). Functions that don't take any arguments should also have void between their parentheses (e.g., void parse(void)), though I don't think this is strictly required.
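As a rough sketch of what the modernised declarations might look like: the token source below (`input`, `next_token`) and the `error_count` counter are hypothetical stand-ins so the snippet compiles on its own; they are not the book's `lexan()` or `error()`.

```c
#include <stdio.h>   /* for EOF */

/* Prototypes: explicit return types, and (void) for "takes no arguments". */
int  next_token(void);
void match(int expected);

static const char *input = "";  /* hypothetical token source, stands in for stdin */
static int lookahead;           /* current token, as in parser.c */
static int error_count;         /* hypothetical: counts mismatches instead of calling error() */

/* Stand-in for lexan(): returns the next character, or EOF at the end. */
int next_token(void)
{
    return *input ? (unsigned char)*input++ : EOF;
}

/* Prototype-style definition: the parameter type sits inside the parentheses. */
void match(int expected)
{
    if (lookahead == expected)
        lookahead = next_token();
    else
        error_count++;
}
```

With prototypes like these in scope, the compiler can diagnose a call such as match() with the wrong number of arguments, which the old-style definitions do not allow it to check.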

jwodder
  • Ah; thanks! I think I had a bit of a brain twitch typing without a void return type, but ignored it. Probably wouldn't have ever realised that it's a good idea to put void inside the parenthesis too - Thanks for that :) – Ashley Davies Mar 27 '14 at 23:38

Some unordered things:
Global variables (tokenval, lineno...) are bad style.

if (t == ' ' || t == '\t')
    ;

Only my opinion, but far more readable:

if (t == ' ' || t == '\t') {}

Some function calls can fail but are not checked for errors
(at least scanf, maybe more).
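A minimal sketch of that kind of check, assuming a hypothetical helper named `parse_number`. It uses sscanf on a string so it is easy to try out; scanf follows the same convention of returning the number of items successfully converted.

```c
#include <stdio.h>

/* Hypothetical helper: parses a decimal integer from text into *out.
   Returns 1 on success and 0 if no number could be converted. */
int parse_number(const char *text, int *out)
{
    /* sscanf returns the number of converted items, or EOF if the
       input ends before the first conversion. */
    return sscanf(text, "%d", out) == 1;
}
```

In lexan(), the equivalent would be to compare the result of scanf("%d", &tokenval) against 1 and report an error otherwise.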

return symtable[p].token

This shouldn't compile: the ; is missing.

Omitting return types, as in parse(), is also bad style.
And things like

match (t)
    int t;

should be

match (int t)

(again, return type is missing too)

And maybe I'll get downvoted for being stupid, but:
Dynamically sized array definitions like char lexbuf[BSIZE]; ...
With all the different standards, I've lost track of where they are allowed and where they aren't,
but if you want to be sure that the code compiles anywhere,
allocate the buffer yourself (pointer, malloc, free).
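For comparison, here is a small sketch of the two forms. Because BSIZE is a macro that expands to the constant 128, char lexbuf[BSIZE]; is an ordinary fixed-size array rather than a run-time-sized one. The `make_buffer` helper is hypothetical and only illustrates the malloc route.

```c
#include <stdlib.h>

#define BSIZE 128

/* After macro expansion this is char lexbuf[128]: a fixed-size array
   whose size is known at compile time. */
char lexbuf[BSIZE];

/* Hypothetical helper for sizes only known at run time.
   Returns NULL on failure; the caller must free() the result. */
char *make_buffer(size_t size)
{
    return malloc(size);
}
```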

deviantfan
  • `BSIZE` is a preprocessor macro defined as 128 in `global.h`, so `char lexbuf[BSIZE];` should work in just about every version of C, even those from before variable-length arrays were introduced in C99. – jwodder Mar 27 '14 at 23:38
  • Ah, the missing ; was me mistyping it. Sorry. Thanks for being thorough - Very helpful. Since there's no objects in C (to my knowledge, at least), what would the alternative to global variables be, if that doesn't sound *too* ignorant? – Ashley Davies Mar 27 '14 at 23:40
  • @AshleyDavies: Passing everything around as function parameter / return values. It will be more complicated, but glob. variables have several downsides. And, about the objects: There are no classes, but you can have structs to pack some variables together. Like a class without any methods (or inheritance or...) – deviantfan Mar 27 '14 at 23:44
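One way the struct suggestion might look in practice is sketched below. The names `lexer_state`, `lexer_init` and `lexer_note_char` are hypothetical; the fields mirror the tokenval and lineno globals from global.h.

```c
/* Hypothetical bundle of the state that global.h keeps in globals. */
struct lexer_state {
    int tokenval;   /* attribute of the most recent token */
    int lineno;     /* current input line */
};

/* Sets the same starting values that lexer.c gives its globals. */
void lexer_init(struct lexer_state *ls)
{
    ls->tokenval = -1;  /* NONE */
    ls->lineno = 1;
}

/* Updates the line counter for one input character. */
void lexer_note_char(struct lexer_state *ls, int c)
{
    if (c == '\n')
        ls->lineno++;
}
```

Each function receives a pointer to the state it works on, so two lexers could run side by side without sharing any variables.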