I have an HTML table that needs to be converted into plain text using the Flex utility on Linux.
I've come up with a list of tokens in my .lex file, which are as follows:
OPENTABLE "<table>"
CLOSETABLE "</table>"
OPENROW "<tr>"
CLOSEROW "</tr>"
OPENHEADING "<th>"
CLOSEHEADING "</th>"
OPENDATA "<td>"
CLOSEDATA "</td>"
STRING [0-9a-zA-Z]+
%%
%%
My CFG (with the translation scheme included) for the HTML parser looks like:
TABLE --> OPENTABLE ROWLIST CLOSETABLE ;
ROWLIST --> ROWLIST ROW | ^ ;
ROW --> OPENROW DATALIST CLOSEROW printf("\n");
DATALIST --> DATALIST DATA | ^ ;
DATA --> OPENDATA STRING CLOSEDATA printf("%s\t", yytext);
I've seen some examples, but I don't understand what I should write in the rules section of my .lex file.
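
For reference, here is a minimal sketch of what the rules section might look like. It assumes the scanner runs standalone, with no yacc/bison parser, so each action prints directly. The tag patterns are quoted because an unquoted `/` is Flex's trailing-context operator. The whitespace and catch-all rules, the `noyywrap` option, and the `main` function are assumptions added for illustration and are not part of my original file.

```lex
%{
/* Standalone sketch: the scanner prints the plain text itself. */
#include <stdio.h>
%}

%option noyywrap

%%
"<table>"        { /* table start: nothing to print */ }
"</table>"       { /* table end: nothing to print */ }
"<tr>"           { /* row start: nothing to print */ }
"</tr>"          { printf("\n"); /* end of row ends the output line */ }
"<th>"|"<td>"    { /* cell start: nothing to print */ }
"</th>"|"</td>"  { printf("\t"); /* a tab after each cell */ }
[0-9a-zA-Z]+     { printf("%s", yytext); /* cell contents */ }
[ \t\r\n]+       { /* skip whitespace between tags */ }
.                { /* ignore any other character */ }
%%

int main(void)
{
    yylex();   /* reads stdin until end of input */
    return 0;
}
```

Assuming the file is saved as `table.lex` (a hypothetical name), it could be built with `flex table.lex && cc lex.yy.c -o table2txt` and run as `./table2txt < input.html`. If the tokens are instead fed to a yacc/bison parser that implements the grammar above, each action would return a token code (for example `return OPENTABLE;`) rather than print, and the `printf` calls would move into the parser's actions.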