
In the following Python code:

from itertools import product
from z3 import *

def get_sp(v0, v1):
    res = sum([v0[i] * v1[i] for i in range(len(v0))])
    return res

def get_is_mod_partition(numbers):
    n = len(numbers)
    mod_power = 2**n
    for signs in product((1, -1), repeat = len(numbers)):
        if get_sp(numbers, signs) % mod_power == 0:
            return 1
    return 0

def check_sat(numbers):
    n = len(numbers)
    s = Solver()
    signs = [Int("s" + str(i)) for i in range(n)]
    for i in range(n):
        s.add(Or(signs[i] == -1, signs[i] == 1))
    s.add(get_sp(numbers, signs) % 2**n == 0)
    print(s.check())

l = [1509, 1245, 425, 2684, 3630, 435, 875, 2286, 1886, 1205, 518, 1372]
check_sat(l)
get_is_mod_partition(l)

check_sat takes 22 seconds and get_is_mod_partition takes 24 milliseconds. I didn't expect such results from a "high-performance theorem prover". Is there a way to drastically improve performance?

  • *IMHO*, the way the problem is encoded, the solver is basically trying its own luck against an exponential number of assignments (in the worst case) wrt. the length of the numbers list. Perhaps adding an explicit `[-1, 1]` interval bound to each `signs[i]` might help the solver make a decision during early pruning calls and thus converge faster to the satisfiable assignment. An alternative is to make `signs[i]` Boolean and write the sum as `sum = (ITE signs[i] +N[i] -N[i]) ...`, but I have no idea how that performs in `z3`. – Patrick Trentin Feb 12 '18 at 20:20
  • How do I make the interval `[-1, 1]` and exclude 0? By adding `signs[i] != 0`? – DSblizzard Feb 12 '18 at 20:32
  • For each `sign[i]` I would write: **1.** `-1 <= sign[i] <= 1` to help early pruning calls **2.** `(<= sign[i] -1) or (<= 1 sign[i])` and `not(<= sign[i] -1) or not(<= 1 sign[i])` to *Boolean*-ize the assignment of values on `sign[i]`, so that it does not need to wait for the *ILP* solver to be activated before finding inconsistent assignments at the *Boolean* level. *[The latter point is a replacement for `Or(signs[i] == -1, signs[i] == 1)`]*. – Patrick Trentin Feb 12 '18 at 20:38

1 Answer


Following Patrick's advice, you can code it as follows:

from z3 import *

def get_sp(v0, v1):
    res = sum([If(v1[i], v0[i], -v0[i]) for i in range(len(v0))])
    return res

def check_sat(numbers):
    n = len(numbers)
    s = Solver()
    signs = [Bool("s" + str(i)) for i in range(n)]
    s.add(get_sp(numbers, signs) % 2**n == 0)
    print(s.check())
    m = s.model()
    mod_power = 2**n
    # reconstruct the signed sum from the model
    terms = ["%s %d" % ("+" if m[sgn] else "-", num)
             for (num, sgn) in zip(numbers, signs)]
    print("( %s ) %% %d == 0" % (" ".join(terms), mod_power))

l = [1509, 1245, 425, 2684, 3630, 435, 875, 2286, 1886, 1205, 518, 1372]
check_sat(l)

Runs in about 0.14 seconds on my machine, and prints:

sat
( - 1509 - 1245 - 425 + 2684 + 3630 + 435 - 875 + 2286 - 1886 - 1205 - 518 - 1372 ) % 4096 == 0
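The reported model can be sanity-checked with plain Python arithmetic, no z3 needed; the sign list below is read off the output above:

```python
numbers = [1509, 1245, 425, 2684, 3630, 435, 875, 2286, 1886, 1205, 518, 1372]
# +1 where the output shows "+", -1 where it shows "-"
signs = [-1, -1, -1, +1, +1, +1, -1, +1, -1, -1, -1, -1]
total = sum(s * x for s, x in zip(signs, numbers))
print(total % 2**len(numbers))  # prints 0: the assignment is a valid witness
```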

However, as Patrick commented, it is not clear why this version is significantly faster than the original. I wanted to do some benchmarking, and did so using Haskell, as I'm more familiar with that language and its Z3 bindings:

import Data.SBV
import Criterion.Main

ls :: [SInteger]
ls = [1509, 1245, 425, 2684, 3630, 435, 875, 2286, 1886, 1205, 518, 1372]

original = do bs <- mapM (const free_) ls
              let inside b = constrain $ b .== 1 ||| b .== -1
              mapM_ inside bs
              return $ sum [b*l | (b, l) <- zip bs ls] `sMod` (2^length ls) .== 0

boolOnly = do bs <- mapM (const free_) ls
              return $ sum [ite b l (-l) | (b, l) <- zip bs ls] `sMod` (2^length ls) .== 0

main = defaultMain [ bgroup "problem" [ bench "orig" $ nfIO (sat original)
                                      , bench "bool" $ nfIO (sat boolOnly)
                                      ]
                   ]

And indeed, the bool-only version is about 8 times faster:

benchmarking problem/orig
time                 810.1 ms   (763.4 ms .. 854.7 ms)
                     0.999 R²   (NaN R² .. 1.000 R²)
mean                 808.4 ms   (802.2 ms .. 813.6 ms)
std dev              8.189 ms   (0.0 s .. 8.949 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking problem/bool
time                 108.2 ms   (104.4 ms .. 113.5 ms)
                     0.997 R²   (0.992 R² .. 1.000 R²)
mean                 109.3 ms   (107.3 ms .. 111.5 ms)
std dev              3.408 ms   (2.261 ms .. 4.843 ms)

Two observations:

  • The Haskell bindings are way faster than the Python ones, by about an order of magnitude.
  • The bool-only version is about another order of magnitude faster than the integer version.

For the former, it might be interesting to find out why the Python bindings are performing so poorly; or simply switch to Haskell :-)

Further analysis, and a cute trick

It appears the issue is with the call to mod. In the Haskell translation, the system internally gives names to all intermediate expressions; which seems to make z3 go fast. Python bindings, however, translate expressions more wholesale, and inspection of the generated code (you can see it by looking at s.sexpr()) reveals it does not name internal expressions. And when mod is involved, I'm guessing the solver's heuristics fail to recognize the essential linearity of the problem and end up spending a lot of time.

To improve the time, you can do the following simple trick. The original says:

s.add(get_sp(numbers, signs) % 2**n == 0)

Instead, make sure the sum gets an explicit name. That is, replace the above line by:

ssum = Int("ssum")
s.add(ssum == get_sp(numbers, signs))
s.add(ssum % 2**n == 0)

You'll see that this also makes the Python version run much faster.

I still would prefer the boolean translation, but this one provides a nice rule of thumb: Try to name intermediate expressions, which gives the solver nice choice points as guided by the semantics of the formula; as opposed to one bulky output. As I mentioned, the Haskell bindings don't suffer from this, as it internally converts all formulas to simple three-operand SSA form, which allows the solver to more easily apply heuristics.

  • I can't see the *non-linear* part of the formula. I thought it was just a problem of delayed detection of simple conflicting assignments. Good suggestion, +1 – Patrick Trentin Feb 12 '18 at 23:15
  • @PatrickTrentin Quite right. See my edited answer. Seems to be an issue with the Python interpretative layer, perhaps? – alias Feb 13 '18 at 02:28
  • @PatrickTrentin I did some benchmarking and even with the Haskell bindings, the bool-only version is about an order of magnitude faster; see my updated answer. The Python bindings do seem to induce a huge penalty, but even with pure Z3 (say, as coded in SMTLib), there is a significant difference. It must have something to do with the internal communication between the SAT and integer engines. It might make a good "performance" test case that the Z3 developers could find useful. – alias Feb 13 '18 at 17:13
  • Did you happen to try the approach I proposed in my second comment as well? How does that perform? I suspect it should be closer to the ITE-based/bool-only approach than to the int-based one. – Patrick Trentin Feb 13 '18 at 17:43
  • @PatrickTrentin See my further analysis in the edited answer. If one gives a name to the intermediate sum before the mod, then the Python version goes fast as well. – alias Feb 14 '18 at 21:52
  • I can't explain this, though. How exactly does it spend a lot of time? Doing what? – Patrick Trentin Feb 14 '18 at 21:55
  • That's really a question for the developers. One guess is that since the multiplications happen inside a call to the modulus operator, they are all treated as non-linear. But this is a pure guess. Perhaps a z3 ticket can help. – alias Feb 14 '18 at 22:14