I have an app that includes a 3 operator (& | !) boolean expression evaluator, with variables and constants. Generally the expressions aren't too long (perhaps 50 terms at the most, but usually a lot less). There can be very many expressions - I'm expecting the upper limit to be around a million. Currently I have a hand written parser with a very simple evaluator that simply recursively traverses the parse tree. One constraint is that this has to be callable from C++. I have no sharing between expressions. I'd like to investigate speeding this up.
I see two avenues of research.
- Add sharing and store the state indicating whether an expression node has been evaluated or not.
- Extract Common Subexpressions.
Also I would expect that a code generation approach will be faster than an interpretive approach working on parse trees or similar structures. It would probably be fairly straightforward to generate some C++ code, but considering the length of the functions, I don't know if a compiler like GCC will be able to optimize the CSEs.
I've seen that there are a few libraries available for expression evaluation, but in my work environment adding 3rd party libraries is not simple plus they all seem very complicated compared to my needs.
Lastly I've been looking at Antlr4 a bit recently, so that might be appropriate for me. In the past I've worked on C code generation, but I have no experience of using something like LLVM for optimisation and code generation.
Any suggestions for which way to go?