I am constructing a recursive descent parser for the following grammar. Is there a way I could return without passing anything back for all the epsilon productions (e)? (I am asking in the context of the parsing approach shown in the code below.)

E = TG
G = +TG | e
T = FH
H = *FH | e
F = (E) | id

#include <stdio.h>

char* next;

int terminal(char);
int E();int G();int G1();int G2();int T();int H();int H1();int H2();int F();int F1();int F2();

int main(int argc, char const *argv[])
{
    char str[10];
    printf("Enter an expression to be parsed : ");
    scanf("%s", str);
    next = &str[0];
    (*next == '\0' && E() == 1) ? printf("Parsed Successfully\n") : printf("Parsed Unsuccessfully\n");
    return 0;
}

int terminal(char token){return *next++ == token;}
int E(){return T() && G();}
int G(){char* temp = next; return (next = temp, G1()) || (next = temp, G2());}
int G1(){return terminal('+') && T() && G();}
int G2(){return;}   //ERROR : non-void function should return a value
int T(){return F() && H();}
int H(){char* temp = next; return (next = temp, H1()) || (next = temp, H2());}
int H1(){return terminal('*') && F() && H();}
int H2(){return;}   //ERROR : non-void function should return a value
int F(){char* temp = next; return (next = temp, F1()) || (next = temp, F2());}
int F1(){return terminal('(') && E() && terminal(')');}
int F2(){return terminal('a');}
mrdoubtful
  • I really don't understand how this code should work .. `*next` is never initialized. –  Jul 15 '15 at 20:03
  • thx for pointing that out...just initialized next – mrdoubtful Jul 15 '15 at 20:05
  • Adding a few more comments without completely understanding what you are trying to do: `*next == '\0'` will short-circuit to `0` (false) if the user entered anything. Get rid of that `scanf` in production code, use `fgets(str, 10, stdin)`. And generally for parsing an *epsilon* ... just return `1` (true)? –  Jul 15 '15 at 20:09
  • eventually, parsing would fail if the user enters a random value though using gets could be a better option and returning 1 might help – mrdoubtful Jul 15 '15 at 20:16
  • Another observation is your `next` pointer never moves.... [*edit*]: Just a quick shot, how about using a state machine (looping) instead of recursion? –  Jul 15 '15 at 20:16
  • oh yes...sorry abt tat...havent run the program yet...was confused on how to skip epsilon..updated the next pointer – mrdoubtful Jul 15 '15 at 20:20
  • To explain the short-circuit thing: if the left-hand side of && evaluates to 0, the expression is 0 without evaluating the right-hand side. So something to start with would be changing `(*next == '\0' && E() == 1)` to `(E() && !*next)` [*edit* or better yet `(E() && *next != 0 && *next != '\n' && *next != '\r')` if you use `fgets` as advisable to avoid buffer overflows] –  Jul 15 '15 at 20:23
  • hey thx a lot! the code seems to be working with returning 1 and yes will keep in mind abt the short circuiting as well – mrdoubtful Jul 15 '15 at 20:31
  • And as a final note: You commonly can't describe a grammar in C and expect it to work as a parser, because there are too many conditions (end of line, end of input) to be recognized at each step. See this [state machine parser](https://github.com/Zirias/llad/blob/master/src/config.c#L222) I wrote lately. I guess a better alternative would be using tools like `flex` and `bison` and I should learn them myself ;) –  Jul 15 '15 at 20:34
  • cool! will check it out! :) – mrdoubtful Jul 15 '15 at 20:37
  • sorry, my comment above was logically wrong ... you want to make sure you reached the end of input, so it would be the other way around: `(E() && (!*next || *next == '\r' || *next == '\n'))` -- of course you would have to check for these on every production, too ... that's what makes it complicated, you don't have to "only" deal with the grammar but also with such things –  Jul 15 '15 at 20:46
  • next is a pointer into the expression that was input (str). Hence, if str was successfully parsed, next would point to '\0', so shouldn't it be `(E() && (*next == '\0'))`? – mrdoubtful Jul 15 '15 at 20:54
  • In theory, yes, but you should never use `scanf("%s", str)` -- it has the same problem as `gets(str)` -- it will read happily beyond your buffer and (in the best case) make your program crash or (in the worst case) allow intruders to do anything they want. –  Jul 15 '15 at 21:07
  • Will keep tat in mind :) – mrdoubtful Jul 15 '15 at 21:13
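
The input-handling advice in the comments above (read with `fgets` instead of `scanf("%s", ...)`, then treat `'\r'` and `'\n'` as end of input) can be sketched as follows. The helper names `strip_newline` and `read_line` are hypothetical and not part of the original code. Stripping the line ending once, up front, is one way to keep the final check as simple as `E() && *next == '\0'`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n',
   removing the line ending that fgets keeps in the buffer. */
void strip_newline(char *line)
{
    assert(line != NULL);
    line[strcspn(line, "\r\n")] = '\0';
}

/* Hypothetical helper: read one line into buf, never writing more than
   size bytes. Returns 1 on success, 0 on end of file or read error. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    strip_newline(buf);
    return 1;
}
```

With this in place, `main` could call `read_line(str, sizeof str, stdin)` and then start the parser on `str`. Input longer than the buffer is truncated rather than overflowing it.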

1 Answer

The code is unreadable. But the logic shows that you should return a boolean. An empty production always succeeds. So it should return true. e.g.,

 int H2(){return 1;}   // epsilon always succeeds; `true` also works after #include <stdbool.h>

Naturally, it would make sense to inline such trivial productions.
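
As a minimal sketch of what inlining could look like, here is the grammar from the question with the epsilon alternatives folded into `G` and `H`. This is one possible arrangement, not the asker's final code. The `parse` wrapper is a hypothetical name. Unlike the original `terminal`, this version assumes the cursor should advance only on a match, so it never moves past the terminating `'\0'`.

```c
#include <assert.h>
#include <stddef.h>

static const char *next;            /* cursor into the input string */

static int E(void);

/* Consume one character if it matches; leave the cursor alone otherwise. */
static int terminal(char token)
{
    if (*next != token)
        return 0;
    next++;
    return 1;
}

/* F = (E) | id   ('a' stands in for id, as in the question) */
static int F(void)
{
    const char *save = next;
    if (terminal('(') && E() && terminal(')'))
        return 1;
    next = save;                    /* backtrack, then try the second alternative */
    return terminal('a');
}

/* H = *FH | e */
static int H(void)
{
    const char *save = next;
    if (terminal('*') && F() && H())
        return 1;
    next = save;                    /* epsilon: consume nothing and succeed */
    return 1;
}

/* T = FH */
static int T(void) { return F() && H(); }

/* G = +TG | e */
static int G(void)
{
    const char *save = next;
    if (terminal('+') && T() && G())
        return 1;
    next = save;                    /* epsilon: consume nothing and succeed */
    return 1;
}

/* E = TG */
static int E(void) { return T() && G(); }

/* Hypothetical entry point: succeeds only if the whole input is consumed. */
int parse(const char *input)
{
    assert(input != NULL);
    next = input;
    return E() && *next == '\0';
}
```

Because the epsilon branch always succeeds, an input such as `a+` is rejected by the final end-of-input check in `parse` rather than inside `G`.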

Laurent Michel