I am currently constructing a symbolic execution engine for a small subset of Python. The most complicated data structure supported by this subset are arbitrarily nested tuples, i.e., you can write something like x = (1, 2, (3, 4), 5)
. In my SE engine, all values are represented as z3 objects. Yesterday, I had a quite hard time trying to model these nested tuples in z3.
What I tried:
- Arrays. Problem: Arrays are, well, arrays, i.e., they span a rectangular space, which is not suitable for my tuples.
- Sequences. I like the sequences data type; however, it does not support nesting. For instance, you cannot write
z3.Concat(z3.Unit(z3.IntVal(1)), z3.Unit(z3.Concat(z3.Unit(z3.IntVal(2)), z3.Unit(z3.IntVal(3)))))
, which raisesz3.z3types.Z3Exception: b"Sort of function 'seq.++' does not match the declared type. Given domain: (Seq Int) (Seq (Seq Int))"
- Lists. However, Lists have a single type
T
, which I can only instantiate toList
or, e.g.,Int
. As far as I know, there is not something like union types or a general super sort in z3. - Uninterpreted functions. I think it should be possible to introduce a function
tuple
with sortTuple
, and even overload it for different argument lengths and sorts. However, I don't know how to extract an element of atuple(1, 2, 3)
, since that expression is not recursively defined.
I would appreciate any help from z3 / SMT experts around here!
Thanks a lot in advance!
Tuple datatypes: Problem with "abstract" indexes & tuples
I also tried the idea suggested by @alias, defining a datatype for tuples. This works quite well, but there is one problem: How can I model the access to a tuple element if either the tuple or the index are not concrete, i.e., are (expressions containing) variables?
I can define, for instance, a 2-tuple of ints as:
Tup_II = z3.Datatype('Tuple_II')
Tup_II.declare('tuple', ('fst', z3.IntSort()), ('snd', z3.IntSort()))
Tup_II = Tup_II.create()
a_tuple = Tup_II.tuple(z3.IntVal(1), z3.IntVal(2))
print(a_tuple) # tuple(1, 2)
tuple_concrete_access = z3.simplify(Tup_II.fst(a_tuple))
print(tuple_concrete_access) # 1
That's all fine. I can also define nested tuples on top by embedding the Tup_II datatype:
Tup_IT = z3.Datatype('Tuple_IT')
Tup_IT.declare('tuple', ('fst', z3.IntSort()), ('snd', Tup_II))
Tup_IT = Tup_IT.create()
another_tuple = Tup_IT.tuple(z3.IntVal(0), a_tuple)
print(another_tuple) # tuple(0, tuple(1, 2))
But to get access to an element, I need to know whether the index is 0 or 1 to choose the right accessor (fst
, snd
).
I tried to get inspiration from how sequence types behave:
int_seq = z3.Concat(z3.Unit(z3.IntVal(1)), z3.Unit(z3.IntVal(2)))
print(int_seq) # Concat(Unit(1), Unit(2))
concrete_access = z3.simplify(int_seq[z3.IntVal(0)])
print(concrete_access) # 1
concrete_access_2 = z3.simplify(int_seq[z3.IntVal(2)])
x = z3.Int("x")
abstract_access = z3.simplify(int_seq[x])
print(abstract_access)
# If(And(x >= 0, Not(2 <= x)),
# seq.nth_i(Concat(Unit(1), Unit(2)), x),
# seq.nth_u(Concat(Unit(1), Unit(2)), x))
So one idea is to define a Tuple_II.nth
function. However, if we have a tuple like Tup_IT consisting of elements of different types, how do I define the target domain of this function? For example,
target_sort = # ???
tup_IT_nth = z3.Function("Tuple_IT.nth", z3.IntSort(), Tup_II, target_sort)
So for that, I'd need some kind of super sort of int
and Tup_II
: The same problem as for lists.
Any ideas? :)
What I'd like to have: Utility function for creating tuple types
Just assume that I can solve the sort problem for the getter function; then I wrote a nice utility function creating all the things you need for dealing with tuples in presence of abstract indexes:
def create_tuple_type(*sorts: z3.SortRef) -> \
Tuple[z3.Datatype, Dict[int, z3.FuncDeclRef], z3.FuncDeclRef, z3.BoolRef]:
"""
DOES NOT YET WORK WITH NESTED TUPLES!
Example:
>>> tuple_II, accessors, getter, axioms = create_tuple_type(z3.IntSort(), z3.IntSort())
>>>
>>> a_tuple = tuple_II.tuple(z3.IntVal(1), z3.IntVal(2))
>>>
>>> print(z3.simplify(accessors[0](a_tuple))) # 1
>>> print(z3.simplify(getter(a_tuple, z3.IntVal(0)))) # Tuple_II.nth(tuple(1, 2), 0)
>>>
>>> s = z3.Solver()
>>> s.set("timeout", 1000)
>>> s.add(z3.Not(z3.Implies(axioms, z3.IntVal(1) == getter(a_tuple, z3.IntVal(0)))))
>>> assert s.check() == z3.unsat # proved!
>>>
>>> s = z3.Solver()
>>> s.set("timeout", 1000)
>>> s.add(z3.Not(z3.Implies(axioms, z3.IntVal(0) == getter(a_tuple, z3.IntVal(0)))))
>>> assert s.check() == z3.unknown # not proved!
:param sorts: The argument sorts for the tuple type
:return: The new tuple type along with
accessor functions,
a generic accessor function,
and axioms for the generic accessor
"""
dt_name = "Tuple_" + "".join([str(sort)[0] for sort in sorts])
datatype = z3.Datatype(dt_name)
datatype.declare('tuple', *{f"get_{i}": sort for i, sort in enumerate(sorts)}.items())
datatype = datatype.create()
accessors = {i: getattr(datatype, f"get_{i}") for i in range(len(sorts))}
target_sort = z3.IntSort() # ??? <-- What to do here?
get = z3.Function(f"{dt_name}.nth", datatype, z3.IntSort(), target_sort)
get_in_range = z3.Function(f"{dt_name}.nth_i", datatype, z3.IntSort(), target_sort)
get_not_in_range = z3.Function(f"{dt_name}.nth_u", datatype, z3.IntSort(), target_sort)
x = z3.Int("x")
t = z3.Const("t", datatype)
axiom_1 = z3.ForAll(
[t, x],
get(t, x) == z3.If(
z3.And(x >= z3.IntVal(0), x < z3.IntVal(len(sorts))),
get_in_range(t, x),
get_not_in_range(t, x)
)
)
axiom_2 = None
for idx in range(len(sorts)):
axiom = get_in_range(t, z3.IntVal(idx)) == accessors[idx](t)
if axiom_2 is None:
axiom_2 = axiom
continue
axiom_2 = z3.And(axiom_2, axiom)
axiom_2 = z3.ForAll([t], axiom_2)
return datatype, accessors, get, z3.And(axiom_1, axiom_2)
The problem is the declaration of target_sort
with the # ??? <-- What to do here?
comment.