Is Drools suitable for writing rules for stemming and/or POS tagging? Suggestions for a better rule language are welcome. I have read many papers in this field that use a rule-based approach, but none of them mention which library or framework was used to write the rules.
My rules look like the following:
if (length = 3 and first_letter in group1 and second_letter in group2) then ...
if (length = 3 and first_letter in group1 and second_letter not_in group2) then ...
if (length = 3 and first_letter not_in group1 and second_letter in group2) then ...
if (length = 3 and first_letter not_in group1 and second_letter not_in group2) then ...
if (length = 4...
... and so on.
The problem is that there are too many of these rules to handle. Imagine that there are ten letter groups and a separate case for each letter belonging to each group. I could easily end up with over a thousand rules to classify a word correctly. I wrote 30 of those rules in plain C# code, and that was enough for me to see how inefficient this approach is. I already have my rules organized as a tree on paper. I just need the right framework to insert, represent, tweak, and test them.
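To make the idea of a rule tree concrete, here is a minimal sketch of how such rules might be held as data and walked by one small function, instead of being spelled out as separate if statements. It is written in Python only for brevity, and it is not a Drools example. The group contents, the action names, and the names GROUPS, RULE_TREE, and classify are all hypothetical placeholders.

```python
# Hypothetical letter groups; the real groups depend on the language being analysed.
GROUPS = {
    "group1": set("aeiou"),
    "group2": set("bcd"),
}

# Decision tree keyed first by word length.
# Each inner node tests "is the letter at this position in this group?"
# and branches on True/False. Leaves are placeholder action names.
RULE_TREE = {
    3: {
        "test": (0, "group1"),          # first letter in group1?
        True: {
            "test": (1, "group2"),      # second letter in group2?
            True: "action_A",
            False: "action_B",
        },
        False: {
            "test": (1, "group2"),
            True: "action_C",
            False: "action_D",
        },
    },
}


def classify(word):
    """Walk the tree for this word; return an action name, or None if no rule applies."""
    node = RULE_TREE.get(len(word))
    while isinstance(node, dict):
        position, group = node["test"]
        node = node[word[position] in GROUPS[group]]
    return node


print(classify("abc"))   # first letter in group1, second in group2
print(classify("word"))  # no rules defined for length 4 in this sketch
```

With this layout, adding or tweaking a rule means editing the data in RULE_TREE, and each branch can be tested by calling classify on a sample word.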
I hope my question is clear. Thank you.