I have to write a function that checks whether input strings are valid for a given language specification. I thought that this would be a standard CFG -> Chomsky Normal Form, then CYK parsing, but one of the rules in the language is preventing this from happening.
Some of the rules are straightforward, if we define terminals {a,b,c,d,e,f,P,Q,R,S}
, then valid strings are
1) Any of the lowercase terminals in isolation
2) If 'x' is a valid string, then so is Sx
But the third rule is
3) If X and Y are valid input strings, then so are PXY, QXY, RXY
where {P,Q,R}
are the remaining uppercase terminals and X and Y are nonterminals.
What would the production rule for this look like? I thought it would be something like
XY -> PXY (and QXY, RXY)
but there are two problems with this. The first is that this is not a CFG rule -- does that mean this language defines a CSG instead?
And the second is that this doesn't work, because
XY -> PXY -> PPXY
would not be a valid message in all cases.