I have the following regular expression:

(12)*[34]+(5[67])*

How can I convert the expression to a grammar that is left recursive?

I came up with this. I did not follow any methods or anything...

G --> GAB34C
A --> A12 | epsilon
B --> B34 | epsilon
C --> C56 | C57 | epsilon
markalex
kiwi kiwi

1 Answer

As commented, 34 is not correct, because 3 and 4 are alternatives, not a sequence. This affects not only the rule for B, but also the rule for G.

Not a problem, but C56 | C57 can be written more compactly as C5(6|7).

So:

G → A(3|4)BC
A → A12 | ε
B → B(3|4) | ε
C → C5(6|7) | ε

If G must also employ left recursion itself, then put the C-recursion inside G (eliminate C):

G → G5(6|7) | A(3|4)B
A → A12 | ε
B → B(3|4) | ε

And maybe it is more elegant to keep (3|4) in the definition of B:

G → G5(6|7) | AB
A → A12 | ε
B → B(3|4) | (3|4)
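
As a quick sanity check of this last grammar, here is one possible derivation of the string 12357 (a string picked purely as an illustration), expanding one nonterminal per step:

```text
G ⇒ G57        (G → G5(6|7), choosing 7)
  ⇒ AB57       (G → AB)
  ⇒ A12B57     (A → A12)
  ⇒ 12B57      (A → ε)
  ⇒ 12357      (B → (3|4), choosing 3)
```

The result consists of one 12, one 3 and one 57, so it matches (12)*[34]+(5[67])*.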

Script

Here is an implementation of the last grammar using JavaScript generators. It produces random strings of the language by choosing between each rule's alternatives with a coin flip:

// Fair coin flip, used to pick one of two alternatives at random
const coin = () => Math.random() < 0.5;

// G → G5(6|7) | AB
function* G() {
    if (coin()) {
        yield* G();
        yield 5;
        yield coin() ? 6 : 7;
    } else {
        yield* A();
        yield* B();
    }
}

// A → A12 | ε
function* A() {
    if (coin()) {
        yield* A();
        yield 1;
        yield 2;
    }
}

// B → B(3|4) | (3|4)
function* B() {
    if (coin()) {
        yield* B();
        yield coin() ? 3 : 4;
    } else {
        yield coin() ? 3 : 4;
    }
}

// Produce some random strings
for (let i = 0; i < 10; i++) console.log(...G());
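
One way to gain some confidence in the grammar is to test generated strings against the original regular expression. The following is only a sketch of that idea: it repeats the generators so that it can run on its own, and `findMismatches` and `pattern` are names made up for this example.

```javascript
// Fair coin flip, used to pick one of two alternatives at random
const coin = () => Math.random() < 0.5;

// G → G5(6|7) | AB
function* G() {
    if (coin()) {
        yield* G();
        yield 5;
        yield coin() ? 6 : 7;
    } else {
        yield* A();
        yield* B();
    }
}

// A → A12 | ε
function* A() {
    if (coin()) {
        yield* A();
        yield 1;
        yield 2;
    }
}

// B → B(3|4) | (3|4)
function* B() {
    if (coin()) yield* B();
    yield coin() ? 3 : 4;
}

// The regular expression from the question, anchored at both ends
const pattern = /^(12)*[34]+(5[67])*$/;

// Hypothetical helper: generates `samples` random strings from G and
// returns those that do NOT match the regular expression.
function findMismatches(samples) {
    const bad = [];
    for (let i = 0; i < samples; i++) {
        const s = [...G()].join("");
        if (!pattern.test(s)) bad.push(s);
    }
    return bad;
}

console.log(findMismatches(1000));
```

An empty array means no counterexample was found among the samples. Random sampling like this can only expose strings the grammar wrongly produces; it does not show that the grammar covers every string the regular expression accepts.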
trincot
  • If I am not wrong, G never terminates? So should it be like this then? G → GA(3|4)BC | 3 | 4, A → A12 | ε, B → B(3|4) | ε, C → C5(6|7) | ε – kiwi kiwi Apr 24 '23 at 16:56
  • @kiwi, you're right; it looks like G should not have a recursive G in its definition. Even `G→GA(3|4)BC|3|4` is problematic, as a 1, 2, 3 or 4 should never follow after C. – trincot Apr 24 '23 at 19:18
  • If I do it like this, should it be correct? And is this still left recursive even though there is no recursion in G? G → A(3|4)BC, A → A12 | ε, B → B(3|4) | ε, C → C5(6|7) | ε – kiwi kiwi Apr 24 '23 at 19:24
  • I think this is not right... G → G12 | BC, B → B(3|4) | (3|4), C → C5(6|7) | ε. Because G12 can only stop when we use BC in front of G12. Hence we can get 312 – kiwi kiwi Apr 24 '23 at 21:46
  • Right, I had mistakenly chosen A as the suffix after the recursive G. It should be C. See update. – trincot Apr 25 '23 at 06:10