For a correct method, can Z3 find a model for the method's verification condition?
I had thought not, but here is an example where the method is correct
yet verification finds a model.
This was with Dafny 1.9.7.
What Malte says is correct (and I found it nicely explained as well).
Dafny is sound, in the sense that it will only verify correct programs. In other words, if a program is incorrect, the Dafny verifier will never say that it is correct. However, the underlying decision problems are in general undecidable. Therefore, unavoidably, there will be cases where a program meets its specifications and the verifier still gives an error message.
Indeed, in such cases, the verifier may even show a purported counterexample. It may be a false counterexample (as in the example above); it simply means that, as far as the verifier can tell, this is a counterexample. If the verifier spent a little more time, or if it were clever enough to unroll more function definitions, apply induction hypotheses, or do a host of other helpful things, it might be able to determine that the counterexample is bogus. So, any error message you get (including any counterexample that may accompany it) should be interpreted as a possible error (and a possible counterexample).
Similar situations frequently occur if you're trying to verify the correctness of a loop and you don't supply a strong enough loop invariant. The Dafny verifier may then show some values of variables on entry to the loop that can never occur in actuality. The counterexample is then trying to give you an idea of how to strengthen your loop invariant appropriately.
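A minimal sketch of this situation, using a hypothetical method CountUp that is not part of the original example. With the invariant shown, the postcondition should follow from the invariant together with the negated loop guard. If the invariant is removed, the verifier is expected to complain about the postcondition, and any counterexample it shows would involve a value of i that no actual run can produce.

```dafny
// Hypothetical example, not from the original thread.
method CountUp(n: nat) returns (i: nat)
  ensures i == n
{
  i := 0;
  while i < n
    // Without this invariant, all the verifier knows after the loop is
    // !(i < n), so it may report a state such as i == n + 1, which can
    // never occur in an actual run.
    invariant i <= n
  {
    i := i + 1;
  }
}
```

Here the spurious values hint at the missing fact: i never exceeds n.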
Finally, let me add two notes to what Malte said.
First, there's at least one other source of incompleteness involved in this example, namely non-linear arithmetic. It can sometimes be difficult to navigate around.
Second, the trick of using function Dummy can be simplified. It suffices (at least in this example) to mention the Pow call, for example like this:
lemma EvenPowerLemma(a: int, b: nat)
  requires Even(b)
  ensures Pow(a, b) == Pow(a*a, b/2)
{
  if b != 0 {
    var dummy := Pow(a, b - 2);
  }
}
Still, I like the other two manual proofs better, because they do a better job of explaining to the user what the proof is.
Rustan
Dafny fails to prove the lemma due to a combination of two possible sources of incompleteness: recursive definitions (here Pow) and induction. The proof effectively fails because of too little information, i.e. because the problem is underconstrained, which in turn explains why a counterexample can be found.
Induction
Automating induction is difficult because it requires computing an induction hypothesis, which is not always possible. However, Dafny has some heuristics for applying induction (which might or might not work), and these can be switched off, as in the following code:
lemma {:induction false} EvenPowerLemma_manual(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    EvenPowerLemma_manual(a, b - 2);
  }
}
With the heuristics switched off, you need to manually "call" the lemma, i.e. use the induction hypothesis (here, only in the case where b >= 2), in order to get the proof through.
In your case, the heuristics were activated, but they were not "good enough" to get the proof done. I'll explain why next.
Recursive definitions
Reasoning statically about recursive definitions by unfolding them is prone to infinite descent, because it is in general undecidable when to stop. Hence, by default, Dafny unrolls function definitions only once. In your example, unrolling the definition of Pow only once is not enough to get the induction heuristics to work, because the induction hypothesis must be applied to Pow(a, b-2), which does not "appear" in the proof (since unrolling once only gets you to Pow(a, b - 1)). However, explicitly mentioning Pow(a, b-2) in the proof, even in an otherwise meaningless formula, triggers the induction heuristics:
function Dummy(a: int): bool
{ true }
lemma EvenPowerLemma(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    assert Dummy(Pow(a, b - 2));
  }
}
The Dummy function is there to make sure that the assertion provides no information beyond syntactically including Pow(a, b-2). A less odd-looking assertion would be assert Pow(a, b) == a * a * Pow(a, b - 2).
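For concreteness, here is a sketch of the lemma with the more natural assertion in place of the Dummy call. The definitions of Pow and Even are not shown in this thread, so the ones below are assumptions chosen to match the discussion (unrolling Pow(a, b) once yields Pow(a, b - 1)). This variant has not been checked against Dafny 1.9.7; since the assertion involves non-linear arithmetic, it may or may not go through depending on the Dafny and Z3 versions.

```dafny
// Assumed definitions: the thread does not show them, so these are guesses
// consistent with the discussion above.
function Pow(a: int, b: nat): int
{
  if b == 0 then 1 else a * Pow(a, b - 1)
}

predicate Even(n: nat)
{
  n % 2 == 0
}

lemma EvenPowerLemma(a: int, b: nat)
  requires Even(b)
  ensures Pow(a, b) == Pow(a*a, b/2)
{
  if b != 0 {
    // Mentions Pow(a, b - 2), giving the induction heuristics a term
    // to which the induction hypothesis can be applied.
    assert Pow(a, b) == a * a * Pow(a, b - 2);
  }
}
```

Unlike the Dummy version, this assertion is itself a proof obligation, so a failure may be reported at the assertion rather than at the postcondition.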
Calculational proof
FYI: You can also make the proof steps explicit and have Dafny check them:
lemma {:induction false} EvenPowerLemma_manual(a: int, b: nat)
  requires Even(b);
  ensures Pow(a, b) == Pow(a*a, b/2);
{
  if (b != 0) {
    calc {
      Pow(a, b);
      == a * Pow(a, b - 1);
      == a * a * Pow(a, b - 2);
      == {EvenPowerLemma_manual(a, b - 2);}
      a * a * Pow(a*a, (b-2)/2);
      == Pow(a*a, (b-2)/2 + 1);
      == Pow(a*a, b/2);
    }
  }
}