I'm trying to implement a very fast boolean expression engine. I'm using it to represent states in very large state spaces, so it needs to handle as many operations per second as possible. At the base of this engine is a sum of products. I'm running into trouble optimizing the NOT operator, though. For example, if I have a sum of products with N minterms, each with around M variables, then inverting it (by De Morgan's laws, a product of N sums of M literals each) creates up to M^N minterms once expanded, which then have to be simplified using the Espresso algorithm. I can speed this up a little and save some memory by running Espresso intermittently during the inversion, but that's not enough. I doubt I'm the first person to run into this problem, and I have tried researching it, but I can't find an efficient way to do this.
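
To make the blowup concrete, here is a minimal sketch of the naive inversion described above: apply De Morgan's laws, then distribute. The cube representation (a dict from variable name to required value) and the function name `naive_complement` are assumptions chosen for illustration, not the engine's actual data structures, and no Espresso-style simplification is applied.

```python
from itertools import product

def naive_complement(cover):
    """Complement a sum of products by De Morgan + distribution.

    cover: list of cubes, each a dict {variable: required_value}.
    Returns a set of cubes (frozensets of (variable, value) pairs).
    """
    products = set()
    # Pick one literal from every cube; each pick yields one product term.
    for choice in product(*(sorted(cube.items()) for cube in cover)):
        term = {}
        for var, val in choice:
            # De Morgan: each chosen literal appears negated.
            if term.setdefault(var, not val) != (not val):
                break  # x AND NOT x: the product is empty, drop it
        else:
            products.add(frozenset(term.items()))
    return products

# N = 3 minterms with M = 2 variables each: (x1&y1)|(x2&y2)|(x3&y3)
cover = [{"x1": True, "y1": True},
         {"x2": True, "y2": True},
         {"x3": True, "y3": True}]
print(len(naive_complement(cover)))  # 8, i.e. M^N = 2^3
```

Because the minterms here share no variables, nothing cancels or merges, so every one of the M^N products survives.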

Can anybody point me in the right direction?

Ned Bingham
  • 3
    In general you cannot avoid the exponential blowup. – n. m. could be an AI Jul 06 '14 at 09:37
  • 2
    No, Boolean expressions are by their nature exponential, but you can reduce it along the way to try to minimize the exponential blowup. From what I can tell, the not operator has a lot of pattern, leading me to believe that using espresso would be like using a 2 ton tungsten rod accelerated from space to hammer in a nail – Ned Bingham Jul 06 '14 at 09:51
  • 5
    No, you cannot always reduce it. Try `(x1&y1)|(x2&y2)|...|(xn&yn)`. After negation this has length of 2^n. – n. m. could be an AI Jul 06 '14 at 09:57
  • 1
    If you have `n` variables, then there are `2^n` entries in the truth table of whatever function you write with those `n` variables. I'm pretty sure that if you **restrict** yourself to expressing the function only in sum-of-products form, then there will always be some assignment for which the sum-of-products form necessarily needs `2^n` minterms. **However, what if you don't restrict yourself to sum-of-products form?** Maybe this could lead to a better alternative. – Apiwat Chantawibul Jul 06 '14 at 10:05
  • 1
    Yes, that is a worst case scenario. But what about the average case? On average if you have M variables and N minterms, the final result will not have M^N minterms because many minterms will have common variables. If you can efficiently take advantage of this during the operation instead of waiting until the end, your average case might not be exponential – Ned Bingham Jul 06 '14 at 10:12
  • 1
    I've looked at multiple different representations, and the sum of products representation seems to be the most efficient representation I can implement. I can cram a 16 variable minterm into one 32 bit integer and with some bit twiddling, most of the desired operations are quite fast. BDDs are problematic because they are inherently recursive which is slow, and because their ordering matters. I might be able to somehow use both a sum of products and a product of sums representation, but the interaction between the two gets very difficult. – Ned Bingham Jul 06 '14 at 10:25
  • There is also the possibility of using a fully hierarchical representation (allowing parentheses), which would mitigate the exponential nature of the NOT operator. It would also add a small amount of recursion, and it's not clear how efficient simplification would be, but it's likely that minterms would only have a few variables, which makes for a lot of wasted space given my current minterm representation. – Ned Bingham Jul 06 '14 at 10:42
  • @Ned If we talk about the average case of a random assignment to the truth table of `n` variables, then we can do a rigorous analysis of that. But if you mean the average case of boolean functions in practice, then I don't know. So, which case do you want to explore? – Apiwat Chantawibul Jul 06 '14 at 10:57
  • @Billiska Let's just go with a random assignment. It's easier to analyze. – Ned Bingham Jul 06 '14 at 16:54
  • I was thinking about performing a factoring step before trying to take the inverse. The factoring step doesn't have to be perfect, it just has to be good enough to reduce the largest exponent. Any thoughts? http://embedded.eecs.berkeley.edu/eecsx44/fall2011/lectures/BooleanBasicsLogicOptimization-2.pdf – Ned Bingham Jul 06 '14 at 19:08
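
The packed representation mentioned in the comments above (a 16-variable minterm in one 32-bit integer) could look something like the following sketch. The two-bits-per-variable encoding used here is an assumption borrowed from positional cube notation, not necessarily the encoding the asker used, and the helper names `literal`, `is_empty`, and `covers` are hypothetical.

```python
NVARS = 16
FULL = 0xFFFFFFFF       # every variable is "don't care": the constant-1 cube
EVEN = 0x55555555       # mask selecting the low bit of each 2-bit field

def literal(var, value):
    """Cube constraining only `var`; all other variables stay don't-care.

    Assumed encoding per variable: bit 2*var set means the variable may be 0,
    bit 2*var+1 set means it may be 1 (11 = don't care, 00 = contradiction).
    """
    cleared = FULL & ~(3 << (2 * var))
    return cleared | (1 << (2 * var + (1 if value else 0)))

def is_empty(cube):
    """True if some variable has both bits cleared (x AND NOT x)."""
    may_be_0 = cube & EVEN
    may_be_1 = (cube >> 1) & EVEN
    return (may_be_0 | may_be_1) != EVEN

def covers(a, b):
    """True if cube `a` contains every assignment that cube `b` does."""
    return (a & b) == b

x0, not_x0, x1 = literal(0, True), literal(0, False), literal(1, True)
print(is_empty(x0 & not_x0))   # True: AND of two cubes is a bitwise AND
print(covers(x0, x0 & x1))     # True: x0 absorbs x0 & x1
```

With this kind of layout, conjunction, contradiction checks, and absorption checks are each a handful of word-wide bit operations, which is the appeal of the flat sum-of-products form.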

2 Answers

3

So, it's been 5 years since I posted this question. After recently rediscovering it, I realized that I committed the cardinal sin: at some point between then and now I found a fairly fast algorithm to get this done and never came back to answer the question. The problem is that I've lost all associated documentation. Welp... here it is. I'll update this answer if I rediscover the source.

https://github.com/nbingham1/boolean/blob/a0f21eb1808dbcf86a3360ea85ab4eae15f5bf49/boolean/cover.cpp#L1055

EDIT: found the source

Multiple-Valued Logic Minimization For PLA Synthesis by Richard L. Rudell, page 58

https://apps.dtic.mil/dtic/tr/fulltext/u2/a606736.pdf

This uses Generalized Shannon Expansion, recursively complementing the two sides of the expansion and merging the complements with a simplifying heuristic.
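
As a rough illustration of the idea (not the linked `cover.cpp` and not Rudell's exact procedure), here is a minimal sketch of complementing by Shannon expansion on binary variables. The cube representation (frozensets of `(variable, value)` pairs), the splitting rule (most frequent variable), and the merge step (a cube found in both branch complements is emitted once, without the splitting literal) are all simplifying assumptions.

```python
from itertools import product

def cofactor(cover, var, val):
    """Restrict a cover to the half-space where var == val."""
    out = []
    for cube in cover:
        lits = dict(cube)
        if lits.get(var, val) != val:
            continue                      # cube vanishes on this branch
        lits.pop(var, None)
        out.append(frozenset(lits.items()))
    return out

def complement(cover):
    """Complement a sum of products: NOT F = x*NOT(F_x) + x'*NOT(F_x')."""
    cover = list(cover)
    if not cover:
        return [frozenset()]              # NOT 0 = 1 (the empty cube)
    if any(len(cube) == 0 for cube in cover):
        return []                         # NOT 1 = 0
    counts = {}
    for cube in cover:
        for var, _ in cube:
            counts[var] = counts.get(var, 0) + 1
    var = max(counts, key=counts.get)     # split on the most used variable
    hi = set(complement(cofactor(cover, var, True)))
    lo = set(complement(cofactor(cover, var, False)))
    both = hi & lo                        # merge: independent of `var`
    result = list(both)
    result += [cube | {(var, True)} for cube in hi - both]
    result += [cube | {(var, False)} for cube in lo - both]
    return result

def evaluate(cover, assignment):
    """True if any cube of the cover is satisfied by the assignment."""
    return any(all(assignment[v] == val for v, val in cube) for cube in cover)

f = [frozenset({("x1", True), ("y1", True)}),
     frozenset({("x2", True), ("y2", True)})]
not_f = complement(f)
names = ["x1", "y1", "x2", "y2"]
print(all(evaluate(not_f, dict(zip(names, bits))) != evaluate(f, dict(zip(names, bits)))
          for bits in product([False, True], repeat=4)))  # True
```

The recursion only ever works on cofactors, which shrink at every level, so intermediate results stay far smaller than the fully distributed product in many practical cases, although the worst case is still exponential as noted in the comments on the question.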

Ned Bingham
0

You can do it in O(n+m), where

answer = ( x1 OR x2 OR .. xn ) AND ( y1 OR y2 OR .. ym )

But you can optimize the process by detecting early that the final answer cannot be 1:

answer = ( x1 OR x2 OR .. xn ) LOGICAL-AND ( y1 OR y2 OR .. ym )

Here LOGICAL-AND checks whether the value of the left operand is 0 and, if so, returns 0 immediately, in O(n+1), without evaluating the right operand.

You can also recast this process as a set operation:

DEFINE X = { X1, X2, .. Xn }
DEFINE Y = { Y1, Y2, .. Ym }

ANSWER =  1 ∈ X  AND  1 ∈ Y

And optimize it like this:

IF 1 ∈ X
THEN RETURN 1 ∈ Y
ELSE RETURN 0

On average, you get Time = i + j, where

i = position of the left-most 1 in X
j = position of the left-most 1 in Y

The worst cases, which take O(n+m), are:

000..001, 000..000

000..001, 000..001
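
The short-circuit evaluation described above can be sketched as follows. The function name `evaluate_and_of_ors` is hypothetical, and note that this evaluates the expression for concrete values of X and Y; it does not produce a complemented sum of products.

```python
def evaluate_and_of_ors(xs, ys):
    """Evaluate (x1 OR ... OR xn) AND (y1 OR ... OR ym) with short-circuiting."""
    # any() stops at the first 1; it scans all n entries only if X has no 1.
    if not any(xs):
        return False        # Y is never inspected in this case
    # Only reached when X contains a 1; stops at the first 1 in Y.
    return any(ys)

print(evaluate_and_of_ors([0, 0, 1], [0, 0, 0]))  # False, after scanning both lists
print(evaluate_and_of_ors([0, 0, 1], [0, 0, 1]))  # True, after scanning both lists
```
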
Khaled.K
  • It's been four years since you posted this, but I've only recently come back to this. This looks like you are suggesting that I keep the inverted result in conjunctive normal form (CNF)? Assuming that this operator is part of some larger algorithm, keeping the result in CNF just moves the complexity to boolean operators that would have to interface CNF and DNF. – Ned Bingham May 20 '19 at 01:38