
I'm writing a simple recursive descent parser that takes standard input and counts the number of 'a' and 'b' characters. The grammar for this is as follows:

S -> A B '\n'

A -> a A | empty

B -> b B | empty

int lookahead; 

int nontermA();
int nontermB();

void match (int terminal){

   if (lookahead == terminal){
      lookahead = getchar();
      } else {
      printf("Syntax error at %c\n", lookahead);
      exit(0);
      }

}

void nontermS(){
   int a,b;
   switch(lookahead){      
      default:
      a = nontermA();
      b = nontermB();
      printf("Match! Number of A's is %d and number of B's is %d", a,b);
      match('\n');
     }

}

int nontermA(){
   int countA = 0;
   switch(lookahead){

      case 'a': match('a'); countA++; nontermA(); break; 
      case 'A': match('A'); countA++; nontermA(); break;
      default: break;   

   }

   return countA;

}

int nontermB(){
   int countB = 0;
   switch(lookahead){
      case 'b': match('b'); countB++; nontermB(); break;
      case 'B': match('B'); countB++; nontermB(); break;
      default: break;
   }
   return countB;

}

Basically, if I type in something like "aA", "bB", or "abAB", it should just output the number of a's and b's, but the actual output for my program is just 1 for a and b. I also get a syntax error when I type in "ba", as well as if I input "B".

    Please provide [A Minimal, Complete, and Verifiable Example (MCVE)](http://stackoverflow.com/help/mcve). Where is `lookahead` set before the first call to `match`?, etc... – David C. Rankin Sep 09 '19 at 02:54

1 Answer


Both of the nontermA and nontermB functions exhibit the same logical flaw. The following describes only nontermA's bug, but the same bug also occurs in nontermB.

 int countA = 0;

This declares a new int variable that's local to nontermA.

  case 'a': match('a'); countA++; nontermA(); break; 

This increments countA, and then nontermA recursively invokes itself. Note that the value returned by the recursive call is ignored.

However each recursive invocation of nontermA works exactly like any other function call to nontermA: it creates a new local variable named countA and initializes it to 0.

The obvious intent here is for the counter to persist across all recursive invocations. But recursion does not work this way.

Each recursive call increments only its own copy of countA, and the value returned by the nested call is discarded. The outermost call therefore returns 1 at most, no matter how many characters were matched. Local variables behave this way in both C and C++.
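The effect can be reproduced in isolation. The following is a minimal sketch in C, not the asker's parser: buggy_count is a hypothetical stand-in for nontermA that walks a string instead of reading standard input, so the behavior is easy to check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for nontermA: counts leading 'a' characters,
   but repeats the bug described above. */
int buggy_count(const char *s) {
    assert(s != NULL);
    int count = 0;              /* a fresh variable in every invocation */
    if (*s == 'a') {
        count++;                /* increments this invocation's copy only */
        buggy_count(s + 1);     /* result of the nested call is discarded */
    }
    return count;               /* 1 if the first character was 'a', else 0 */
}
```

Calling buggy_count("aaa") returns 1 rather than 3, which matches the output reported in the question.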

The solution is simple: drop the local counter and have both nontermA and nontermB return the count directly, adding 1 to the result of the recursive call.

  case 'a': match('a'); return nontermA()+1;

And, otherwise,

return 0;

when no match occurs.
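Put together, the fix might look like the sketch below. It is an illustration under stated assumptions, not the asker's final code: it reads from a string through a hypothetical cursor pointer instead of getchar(), the names count_a, count_b and parse_line are made up for the example, and it handles only the lowercase terminals that appear in the grammar.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor into the input, playing the role of lookahead. */
static const char *cursor;

/* A -> a A | empty : returns the number of 'a' characters consumed. */
static int count_a(void) {
    if (*cursor == 'a') {
        cursor++;                  /* equivalent of match('a') */
        return count_a() + 1;      /* this 'a' plus whatever follows */
    }
    return 0;                      /* empty production */
}

/* B -> b B | empty : returns the number of 'b' characters consumed. */
static int count_b(void) {
    if (*cursor == 'b') {
        cursor++;                  /* equivalent of match('b') */
        return count_b() + 1;
    }
    return 0;
}

/* S -> A B '\n' : stores the counts and returns 1 on success,
   or returns 0 if the input does not end with '\n' at this point. */
int parse_line(const char *input, int *a, int *b) {
    assert(input != NULL && a != NULL && b != NULL);
    cursor = input;
    *a = count_a();
    *b = count_b();
    return *cursor == '\n';
}
```

With this version, parse_line("aabbb\n", &a, &b) succeeds with a equal to 2 and b equal to 3. An input such as "ba\n" is rejected, because the grammar only allows a's before b's.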

Sam Varshavchik
  • It works! It also fixed the bug where it would throw a syntax error at 'B'. Still running into the weird syntax error when I input 'ba', however. I think that's the fault of either nontermS() or match(). – Ben Nutter Sep 09 '19 at 03:22
  • As defined, the grammar only accepts zero or more `a`s, followed by zero or more `b`s. Your syntax error is consistent with the grammar. – Sam Varshavchik Sep 09 '19 at 11:01