How do I describe the invariants for this simple algorithm in Dafny?

Question

Attempting to verify formally the following problem. https://leetcode.com/problems/count-equal-and-divisible-pairs-in-an-array/

Given a 0-indexed integer array nums of length n and an integer k, return the number of pairs (i, j) where 0 <= i < j < n, such that nums[i] == nums[j] and (i * j) is divisible by k.

function countPairs(nums: number[], k: number): number {
    let count = 0;
    for(let i = 0; i < nums.length-1; i++) {
        for(let j = i+1; j < nums.length; j++) {
            if(nums[i] == nums[j] && (i*j) % k == 0) {
                count++;
            }
        }
    }
    return count;
};

I cannot figure out how to verify this silly thing. Maybe I'm missing a super simple description of the problem and I'm too focused on irrelevant details.

My strategy started with trying to compare the result to the cardinality of a matching set. However, getting dafny to believe the cardinality of the set matches during the two loops seems impossible.

function method satPairs(nums: seq<nat>, k: nat, a: nat, b: nat): bool
    requires k > 0
    requires a <= b < |nums| 
{
    nums[a] == nums[b] && (a*b) % k == 0
}

function matchPairs(nums: seq<nat>, k: nat): nat 
    requires k > 0
{
    |set x,y | 0 <= x < y < |nums| && nums[x] == nums[y] && (x*y) % k == 0 :: (x,y)|
}

function method pairsI(nums: seq<nat>, k: nat, i: nat): set<(nat, nat)> 
    requires k > 0
    requires 0 <= i < |nums|
    ensures forall x,y :: 0 <= x < i && x <= y < |nums| && satPairs(nums, k, x, y) ==> (x,y) in pairsI(nums, k, i)
{
    set x: nat,y:nat | 0 <= x < i && x <= y < |nums| && satPairs(nums, k, x, y) :: (x,y)
}

I also tried to setup invariants based on counts using methods which could count the matching pairs using a column at a time. However, I ran into no end of incompatible conditions maintaining the invariants before and after the while loop. I was hoping these helper functions could be defined recursively allowing some induction on the result to be used inside the two loops for each invariant, but it didn't work.

function countSeqPairs(nums: seq<nat>, k: nat, start: nat,  stop: nat): nat 
    requires k > 0
    requires start <= stop <= |nums|
    decreases |nums|-start, |nums|-stop
{
    if start > stop || stop >= |nums| || start >= |nums| then 0 else 
    if stop < |nums| then (if satPairs(nums, k, start, stop) then 1 + countSeqPairs(nums, k, start, stop+1) else countSeqPairs(nums, k, start, stop+1)) else countSeqPairs(nums, k, start+1, stop+2)

}

function countSeqSlice(nums: seq<nat>, k: nat, start: nat, stop: nat): nat 
    requires k > 0
    requires start <= stop <= |nums|
    decreases |nums| - stop
{
    if start > stop || stop >= |nums| then 0
    else if satPairs(nums, k, start, stop) then 1 + countSeqSlice(nums, k, start, stop+1) else countSeqSlice(nums, k, start, stop+1)
}

Here is the main method, with various non-working attempts at invariants.

method countPairs(nums: seq<nat>, k: nat) returns (count: nat) 
    requires k > 0
    requires |nums| >= 2;
{
    count := 0;
    //ghost var cpairs: set<(nat, nat)> := {};
    for i : nat := 0 to |nums|-2 
        invariant count >= 0
        //invariant cpairs == pairsI(nums, k, i)
    {
        // ghost var occount := count;
        // ghost var increment := 0;
        for j : nat := i+1 to |nums|-1 
            invariant count >= 0
            // invariant count == occount + increment
            // invariant satPairs(nums, k, i, j) ==> increment == increment + 1 
            // invariant count == 0 || satPairs(nums, k, i, j) ==> count == count + 1
            //invariant cpairs == pairsI(nums, k, i) + set z: nat | i+1 <= z <= j && satPairs(nums, k, i, z) :: (i, z)
        {
            // ghost var currcount := count;
            // if nums[i] == nums[j] && (i*j)% k == 0 {
            if  i+1 <= j <= j && satPairs(nums, k, i, j) {
                // increment := increment + 1;
                //cpairs := {(i,j)}+cpairs;
                count := count + 1;
            }
        }
    }
}

It seems like there is no shorter description than the method itself for what is to be ensured. In addition to help with the above, I have three questions, generally how do you describe an invariant for something which is sort of manifestly arbitrary like this?

Secondly, what strategy can be used to handle loop invariants which are not true before the loop is run but are afterwards? It tends to be the initial condition is set to 0 or some other empty value, but then after the 0-th iteration of the loop it will be set to some value, and then the invariant fails. I keep running into this situation and it feels like there should be some sort of standard guard for it.

Finally, can or should ghost variables be used in method ensure statements?

You said you are trying to verify your solution to this problem, but your method has no postcondition. What exactly are you trying to prove? — James Wilcox, May 30 '22 at 19:32
That's a good question. Ideally, it seems like it should be `ensure count == matchPairs(nums, k)`. However, I just can't get it to work and I ended up looking any other post conditions that I could verify about this algorithm. — Hath995, May 30 '22 at 19:35
Thank, I appreciate it. I really want to learn how to verify significant programs. I have definitely made some progress on understanding, often thanks to your answers on SO. — Hath995, May 30 '22 at 19:53

James Wilcox · Accepted Answer · 2022-05-31T05:06:32.743

Here is one way to do it.

function method satPairs(nums: seq<nat>, k: nat, a: nat, b: nat): bool
    requires k > 0
    requires a <= b < |nums| 
{
    nums[a] == nums[b] && (a*b) % k == 0
}

function matchPairsHelper(nums: seq<nat>, k: nat, bound: int): set<(int, int)>
  requires k > 0
  requires bound <= |nums|
{
    set x,y | 0 <= x < bound && x < y < |nums| && satPairs(nums, k, x, y) :: (x,y)
}

function matchPairs(nums: seq<nat>, k: nat): set<(int, int)>
  requires k > 0
{
  matchPairsHelper(nums, k, |nums|)
}

function innerMatchPairsHelper(nums: seq<nat>, k: nat, outer: int, inner_bound: int): set<(int, int)>
  requires k > 0
  requires inner_bound <= |nums|
{
    set y | 0 <= outer < y < inner_bound && satPairs(nums, k, outer, y) :: (outer,y)
}

method countPairs(nums: seq<nat>, k: nat) returns (count: nat) 
    requires k > 0
    requires |nums| >= 2
    ensures count == |matchPairs(nums, k)|
{
  count := 0;
  for i : nat := 0 to |nums|
    invariant count == |matchPairsHelper(nums, k, i)|
  {
    for j : nat := i+1 to |nums|
      invariant count == |matchPairsHelper(nums, k, i)| + |innerMatchPairsHelper(nums, k, i, j)|
    {
      assert innerMatchPairsHelper(nums, k, i, j+1) ==
        (if satPairs(nums, k, i, j) then {(i, j)} else {}) + innerMatchPairsHelper(nums, k, i, j);
      if satPairs(nums, k, i, j) {
        count := count + 1;
      } 
    }
    assert matchPairsHelper(nums, k, i+1) == matchPairsHelper(nums, k, i) + innerMatchPairsHelper(nums, k, i, |nums|);
  }
}

A few notes:

James's rule of set comprehensions: never use a set comprehension as part of any larger expression, but only as the body of a function that returns that set. This facilitates referring to the set multiple times in assertions without confusing Dafny.
James's rule of cardinalities: Dafny will never prove two cardinalities equal. You must manually ask it to prove two sets equal, and from there, it will conclude the cardinalities are equal.
I didn't really look at your invariants. I just tried to write down what the loops are doing. The outer loop is a loop over the x coordinate. So the invariant is that all the "right" pairs have been counted for all x coordinates smaller than i (for all values of y). Then for the inner loop, the invariant is that all the pairs for the exact x coordinate i have been counted up to j. I define two functions for these notions.
Other than that, just need to assert some set equalities in a few places.

To your other questions:

how do you describe an invariant for something which is sort of manifestly arbitrary like this?

I'm not sure I understand the question. Are you asking "how do I discover the right loop invariants for this postcondition?" or "how do I state the postcondition when it seems like it is just as long as the code itself?"
what strategy can be used to handle loop invariants which are not true before the loop is run but are afterwards?

I didn't need this in my solution, but if you do need it then the easiest way is something like i == 0 || <invariant that is true only after loop starts>. Generally I consider such invariants to have a "bad smell" and try to refactor to avoid them. Sometimes they are unavoidable though.
can or should ghost variables be used in method ensure statements?

Not sure I understand this one either. You cannot refer to any local variables of a method in an ensures clause except those that are returned by the method. If needed you can return a ghost variable as in
```
method Foo() returns (bar: int, ghost baz: int)
  ensures ... mentions bar and baz just fine ...
```
Does that answer this one?

"how do I state the postcondition when it seems like it is just as long as the code itself?" is probably the best version of the question. Your answer about ghost variables makes sense. I guess I thought that maybe I couldn't describe the ensure condition without exposing more structure in the return using ghost variables, because the question was so simplistic. — Hath995, May 30 '22 at 20:12
yeah, when the spec is similar to the code there's not much you can do except try to verify a larger client application that uses the code as a library. then you hope that the client spec will be simpler. — James Wilcox, May 31 '22 at 05:06

How do I describe the invariants for this simple algorithm in Dafny?

1 Answers1