
I've spent a week trying to find a solution to this problem and ended up reading about the CYK algorithm, but I can't understand how it would help me.

I have a string that I start from; let's call it startString.
And I have a string that I want to reach by applying the rules explained below; let's call it stopString.

-----------

Now let's take an example:

startString = "A"  
stopString = "2403"

This example's rules are the following:

A->BC  
B->D  
B->ED  
C->F  
C->FB  


E->0  
D->2  
D->3  
F->4  
E->1  

The program will take the above input and output the minimal list of transformations applied to get from startString to stopString, which COULD BE the following:

A->BC, B->D, C->FB, B->ED, D->2, F->4, E->0, D->3
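A quick way to check a candidate list like this one is to replay it. The following is only a sketch: `apply_rules` is a hypothetical helper, and it assumes every symbol is a single character and that each rule is applied to the leftmost occurrence of its left-hand side.

```python
def apply_rules(start, steps):
    """Apply each 'X->rhs' step to the leftmost X; return every intermediate string."""
    current = start
    trace = [current]
    for step in steps:
        lhs, rhs = step.split("->")
        if lhs not in current:
            raise ValueError(f"{lhs} does not occur in {current!r}")
        # Replace only the first (leftmost) occurrence of the left-hand side.
        current = current.replace(lhs, rhs, 1)
        trace.append(current)
    return trace


steps = "A->BC, B->D, C->FB, B->ED, D->2, F->4, E->0, D->3".split(", ")
trace = apply_rules("A", steps)
print(" => ".join(trace))
# A => BC => DC => DFB => DFED => 2FED => 24ED => 240D => 2403
```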

-----------

My question is: How does CYK help me here? How do I obtain "2403" from "A" using CYK? Is there any simpler solution to this problem?

  • CYK is the hallmark solution for this problem. – erip Nov 22 '16 at 11:30
  • Don't think about advanced parsing algorithms. Just write a parser. I'd start by rewriting the rules (they're simple) to rules with iteration. – Cheers and hth. - Alf Nov 22 '16 at 11:32
  • Yeah, that much I know already. But how do I implement it? All I could find about CYK was the matrix algorithm for checking whether a string is obtainable using a grammar. What is this grammar? And how do I obtain it from my rules? – ijustpostedsomethingdumb Nov 22 '16 at 11:33
  • Your "rules" are the grammar. Have you checked the Wikipedia page on [CYK](https://en.wikipedia.org/wiki/CYK_algorithm)? – Simon Kraemer Nov 22 '16 at 11:34
  • You need to write your grammar in Chomsky Normal Form before you can use CYK, though. – erip Nov 22 '16 at 11:35
  • For example, B→D and B→ED together mean that a B is a possible E followed by a D, B→{E}D. That's easier to parse (for me). So that parse of the initial 2 is D, by your terminal rule, and that in turn is a B. Which in turn is part of an A or C. – Cheers and hth. - Alf Nov 22 '16 at 11:36
  • @erip But it's pretty obvious it CAN be generated: A->BC, B->D, C->FB, B->ED, D->2, F->4, E->0, D->3 – ijustpostedsomethingdumb Nov 22 '16 at 11:49
  • @ijustpostedsomethingdumb Perhaps my grammar is wrong. I'll check. – erip Nov 22 '16 at 11:52

1 Answer


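I'll start by writing your grammar in Chomsky Normal Form (CNF), where every rule has either exactly two non-terminals or a single terminal on its right-hand side. Two rules currently violate this, because each rewrites a non-terminal into a single non-terminal: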

B->D
C->F

A naive way to fix this is to replace them with the terminal rules they lead to (B->D with D->2 and D->3; C->F with F->4), which gives three rules:

B->2 
B->3 
C->4

Now your production rules are

A->BC  
B->ED  
B->2 
B->3 
C->4
C->FB  
D->2  
D->3  
E->0  
E->1 
F->4  
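The same substitution can be done mechanically. Below is a minimal sketch, not a full CNF converter: it only removes unit rules (a non-terminal rewritten to a single non-terminal), assumes single-character symbols, and represents rules as `(lhs, rhs)` pairs. The function name `eliminate_unit_rules` is made up for this example.

```python
def eliminate_unit_rules(rules):
    """Replace unit rules X->Y by X->rhs for every non-unit rule reachable from X."""
    nonterminals = {lhs for lhs, _ in rules}

    def is_unit(rhs):
        return len(rhs) == 1 and rhs in nonterminals

    result = []
    for nt in sorted(nonterminals):
        # Collect every non-terminal reachable from nt through unit rules only.
        reachable, stack = {nt}, [nt]
        while stack:
            cur = stack.pop()
            for lhs, rhs in rules:
                if lhs == cur and is_unit(rhs) and rhs not in reachable:
                    reachable.add(rhs)
                    stack.append(rhs)
        # nt inherits every non-unit rule of the reachable non-terminals.
        for lhs, rhs in rules:
            if lhs in reachable and not is_unit(rhs) and (nt, rhs) not in result:
                result.append((nt, rhs))
    return result


original = [("A", "BC"), ("B", "D"), ("B", "ED"), ("C", "F"), ("C", "FB"),
            ("E", "0"), ("D", "2"), ("D", "3"), ("F", "4"), ("E", "1")]
cnf_rules = eliminate_unit_rules(original)
for lhs, rhs in cnf_rules:
    print(f"{lhs}->{rhs}")
```

On this grammar the result is the 11 rules listed above. A general CNF conversion would also have to handle empty rules, long right-hand sides, and terminals mixed with non-terminals, none of which occur here.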

The solver I'm using (here) doesn't allow numbers as terminals. Thus, I'll just create a bijective mapping

0 <-> a
1 <-> b
2 <-> c
3 <-> d
4 <-> e 
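If you want to apply this mapping in code rather than by hand, Python's built-in `str.maketrans` and `str.translate` are enough. This is just an illustration; the names `to_letters` and `to_digits` are made up.

```python
# Translation tables for the mapping above, one per direction.
to_letters = str.maketrans("01234", "abcde")
to_digits = str.maketrans("abcde", "01234")

mapped = "2403".translate(to_letters)
print(mapped)                       # cead
print(mapped.translate(to_digits))  # 2403
```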

The new target string is "cead" instead of "2403". Plugging the mapped rules and this string into the solver gives:

Enter the start Variable A

Number of productions 11
A->BC
B->d
B->ED
C->e
C->FB
E->a
D->c
D->d
F->e
E->b
B->c

Enter string to be checked : cead
    A
          C
    A           B
   DB    CF     E    BD
String can be generated

Unfortunately, the rules are a little obscured because of the naive fix; however, I think this will suffice.
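To get the rules out rather than just a yes/no answer, the CYK table can store back-pointers. The following is a rough sketch of that idea, not the solver used above: it assumes single-character symbols, takes the CNF rules as `(lhs, rhs)` pairs, works on the digits directly, and keeps only the first way it finds to build each cell. The names `cyk` and `CNF_RULES` are mine.

```python
CNF_RULES = [("A", "BC"), ("B", "ED"), ("B", "2"), ("B", "3"), ("C", "4"), ("C", "FB"),
             ("D", "2"), ("D", "3"), ("E", "0"), ("E", "1"), ("F", "4")]


def cyk(word, rules, start):
    """Return a leftmost derivation of word as (lhs, rhs) pairs, or None."""
    n = len(word)
    if n == 0:
        return None
    # back[i][j] maps a non-terminal X to how X derives word[i..j]:
    # None for a terminal rule, otherwise (split point k, right-hand side).
    back = [[{} for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for lhs, rhs in rules:
            if rhs == ch:
                back[i][i].setdefault(lhs, None)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for lhs, rhs in rules:
                    if (len(rhs) == 2 and rhs[0] in back[i][k]
                            and rhs[1] in back[k + 1][j]):
                        back[i][j].setdefault(lhs, (k, rhs))
    if start not in back[0][n - 1]:
        return None

    derivation = []

    def expand(symbol, i, j):
        entry = back[i][j][symbol]
        if entry is None:
            derivation.append((symbol, word[i]))
        else:
            k, rhs = entry
            derivation.append((symbol, rhs))
            expand(rhs[0], i, k)
            expand(rhs[1], k + 1, j)

    expand(start, 0, n - 1)
    return derivation


result = cyk("2403", CNF_RULES, "A")
print(", ".join(f"{lhs}->{rhs}" for lhs, rhs in result))
# A->BC, B->2, C->FB, F->4, B->ED, E->0, D->3
```

Note that the output contains B->2, which is one of the rules introduced by the naive fix; in the original rules it stands for B->D followed by D->2.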

erip
  • Yes, this seems about right. But how do I implement CNF-transformation? – ijustpostedsomethingdumb Nov 22 '16 at 12:20
  • [Here](https://people.cs.clemson.edu/~goddard/texts/theoryOfComputation/9a.pdf) are some slides that cover this topic. – erip Nov 22 '16 at 12:21
  • Also, is there a way of avoiding CNF-transformation as it modifies the "rules" that I need to print? – ijustpostedsomethingdumb Nov 22 '16 at 12:22
  • @ijustpostedsomethingdumb You can write a different parser. CYK is typically the most efficient for parsing CFGs. You can write a recursive-descent parser, too, or any of a myriad of others. Recursive-descent is the easiest IMO. – erip Nov 22 '16 at 12:23
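As an illustration of the recursive-descent suggestion, here is a minimal backtracking sketch that works on the original rules, so no CNF transformation is needed and the rules it prints are the ones from the question. It assumes single-character symbols, no left recursion, and no empty rules (all true for this grammar). `RULES` and `derive` are names chosen for the example, and the search can take exponential time on larger grammars.

```python
RULES = {"A": ["BC"], "B": ["D", "ED"], "C": ["F", "FB"],
         "D": ["2", "3"], "E": ["0", "1"], "F": ["4"]}


def derive(symbols, text):
    """Return a shortest leftmost derivation turning symbols into text, or None."""
    if not symbols:
        return [] if not text else None
    if len(symbols) > len(text):
        return None  # every symbol produces at least one character
    head, rest = symbols[0], symbols[1:]
    if head not in RULES:
        # Terminal: it must match the next character of the text.
        return derive(rest, text[1:]) if text[0] == head else None
    best = None
    for rhs in RULES[head]:
        # Try each alternative for the leftmost non-terminal and backtrack on failure.
        tail = derive(rhs + rest, text)
        if tail is not None and (best is None or len(tail) + 1 < len(best)):
            best = [f"{head}->{rhs}"] + tail
    return best


result = derive("A", "2403")
print(", ".join(result))
# A->BC, B->D, D->2, C->FB, F->4, B->ED, E->0, D->3
```

These are the same eight rules as in the question, listed in leftmost-derivation order.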