I've got a lot of true/false results saved as bits in long[]
arrays. I do have a huge number of these (millions and millions of longs).
For example, say I have only five results, I'd have:
+----- condition 5 is true
|
|+---- condition 4 is false
||
||+--- condition 3 is true
|||
|||+-- condition 2 is true
||||
||||+- condition 1 is false
10110
I also do have a few trees representing statements like:
condition1 AND (condition2 OR (condition3 AND condition 4))
The trees are very simple but very long. They basically look like this (it's an oversimplification below, just to show what I've got):
class Node {
int operator();
List<Node> nodes;
int conditionNumber();
}
Basically either the Node is a leaf and then has a condition number (matching one of the bit in the long[] arrays) or the Node is not a leaf and hence refers several subnodes.
They're simple yet they allow to express complicated boolean expressions. It works great.
So far so good, everything is working great. However I do have a problem: I need to evaluate a LOT of expressions, determining if they're true or false. Basically I need to do some brute-force computation for a problem for which there's no know better solution than brute-forcing.
So I need to walk the tree and answer either true
or false
depending on the content of the tree and the content of the long[]
.
The method I need to optimize looks like this:
boolean solve( Node node, long[] trueorfalse ) {
...
}
where on the first call the node
is the root node and then, obviously, subnodes (being recursive, that solve
method calls itself).
Knowing that I'll only have a few trees (maybe up to a hundred or so) but millions and millions of long[]
to check, what steps can I take to optimize this?
The obvious recursive solution passes parameters (the (sub)tree and the long[], I could get rid of the long[]
by not passing it as a parameter) and is quite slow with all the recursive calls etc. I need to check which operator is used (AND or OR or NOT etc.) and there's quite a lot of if/else or switch statements involved.
I'm not looking for another algorithm (there aren't) so I'm not looking for going from O(x) to O(y) where y would be smaller than x.
What I'm looking for is "times x" speedup: if I can write code performing 5x faster, then I'll have a 5x speedup and that's it and I'd be very happy with it.
The only enhancement I see as of now --and I think it would be a huge "times x" speedup compared to what I have now-- would be to generate bytecode for every tree and have the logic for every tree hardcoded into a class. It should work well because I'll only ever have a hundred or so trees (but the trees aren't fixed: I cannot know in advance how the trees are going to look like, otherwise it would be trivial to simply hardcode manually every tree).
Any idea besides generating bytecode for every tree?
Now if I want to try the bytecode generation route, how should I go about it?