
I am trying to prove the following property in Dafny: covariance(x,y) <= variance(x) * variance(y), where x and y are sequences of reals.

First of all, I would like to know whether I am interpreting covariance and variance correctly, so I ask the following question: is proving covariance(x,y) <= variance(x) * variance(y) the same as proving covariance(x,y) <= covariance(x,x) * covariance(y,y)? I ask this because the variance usually appears raised to the power of 2 (i.e., sigma^2(x)), which confuses me.

However, I understand from https://en.wikipedia.org/wiki/Covariance that this equivalence holds. Concretely, it states: "The variance is a special case of the covariance in which the two variables are identical."

I also take one of the definitions of covariance from that link. Concretely: cov(x,y)=(1/n)*summation(i from 1 to n){(x_i-mean(x))*(y_i-mean(y))}.
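To address the sigma^2 confusion directly: instantiating this definition with y = x gives the variance itself (already squared), which is what the Wikipedia quote means by "special case":

```latex
\operatorname{cov}(x,x)
  = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})
  = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2
  = \sigma^2(x)
```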

Thus, assuming that I must prove covariance(x,y) <= covariance(x,x) * covariance(y,y) and that covariance is defined like above, I test the following lemma in Dafny:

lemma covariance_equality (x: seq<real>, y:seq<real>)
    requires |x| > 1 && |y| > 1; //my conditions
    requires |x| == |y|; //my conditions
    requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
    requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
    ensures covariance_fun (x,y) <= covariance_fun(x,x) * covariance_fun (y,y);
    {}

Which Dafny cannot prove automatically. The point is that I do not know how to prove it manually. I made this (unsuccessful) attempt at a proof by contradiction, but I guess this is not the way:

lemma covariance_equality (x: seq<real>, y:seq<real>)
    requires |x| > 1 && |y| > 1; //my conditions
    requires |x| == |y|; //my conditions
    requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
    requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
    ensures covariance_fun (x,y) <= covariance_fun(x,x) * covariance_fun (y,y);
    {

    if !(covariance_fun(x,y) <= covariance_fun(x,x) * covariance_fun(y,y)) {

      //...
      assert !(forall elemX' :: elemX' in x ==> 0.0<elemX') && (forall elemY' :: elemY' in y ==> 0.0<elemY'); //If we assume this, the proof is verified
      assert false;
    } 

}

My main question is: how can I prove this lemma?

So that you have all the information, I provide the functions that make up the covariance calculation. First, the covariance function is as follows:

function method covariance_fun(x:seq<real>, y:seq<real>):real
    requires |x| > 1 && |y| > 1;
    requires |x| == |y|;
    decreases |x|,|y|;
{
    summation(x,mean_fun(x),y,mean_fun(y)) / ((|x|-1) as real)
}

As you can see, it performs the summation and then divides by the length minus one (the sample covariance). Also, note that the summation function receives the mean of x and the mean of y. Summation is as follows:

function method summation(x:seq<real>,x_mean:real,y:seq<real>,y_mean:real):real
  requires |x| == |y|;
  decreases |x|,|y|;
{
if x == [] then 0.0 
else ((x[0] - x_mean)*(y[0] - y_mean)) 
    + summation(x[1..],x_mean,y[1..],y_mean)
}

As you can see, it recursively accumulates (x_i - mean(x)) * (y_i - mean(y)) over each position.
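As a side observation, summation applied to a sequence paired with itself is a sum of squares, hence non-negative; here is a minimal sketch of that fact (the lemma name summation_self_nonneg is mine, not from the post), which Dafny should discharge by induction:

```dafny
lemma summation_self_nonneg(x: seq<real>, m: real)
  decreases x
  ensures summation(x, m, x, m) >= 0.0
{
  if x != [] {
    // the head contribution is a square, hence non-negative
    assert (x[0] - m) * (x[0] - m) >= 0.0;
    // the tail is non-negative by the induction hypothesis
    summation_self_nonneg(x[1..], m);
  }
}
```

This is the same reason why the variance (i.e., covariance(x,x)) is never negative, a point that comes up again later in the discussion.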

As for helper functions, mean_fun calculates the mean and is as follows:

function method mean_fun(s:seq<real>):real
  requires |s| >= 1;
  decreases |s|;
{
    sum_seq(s) / (|s| as real)
}

Note that it uses the sum_seq function which, given a sequence, calculates the sum of all its elements. It is defined recursively:

function method sum_seq(s:seq<real>):real
  decreases s;
{
if s == [] then 0.0 else s[0] + sum_seq(s[1..])
}
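One standard helper about such recursive sums, should it be needed, is that sum_seq distributes over concatenation; a sketch (lemma name mine), where the asserts nudge Dafny through the sequence reasoning:

```dafny
lemma sum_seq_append(s: seq<real>, t: seq<real>)
  decreases s
  ensures sum_seq(s + t) == sum_seq(s) + sum_seq(t)
{
  if s == [] {
    // [] + t is just t
    assert s + t == t;
  } else {
    // peeling the head of s + t peels the head of s
    assert (s + t)[1..] == s[1..] + t;
    sum_seq_append(s[1..], t);
  }
}
```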

Now, with all these ingredients, can anyone tell me (1) whether I have already done something wrong and/or (2) how to prove the lemma?

PS: I have been looking around and this lemma seems to be the Cauchy–Schwarz inequality, but I still do not know how to prove it.

Elaborating on the answer

I will (1) define a covariance function cov in terms of product and then (2) use this definition in a new lemma that should hold.

Also, note that I add the restriction that x and y are non-empty, but you can relax this.

//calculates the mean of a sequence
function method mean_fun(s:seq<real>):real
  requires |s| >= 1;
  decreases |s|;
{
    sum_seq(s) / (|s| as real)
}

//from x it constructs a=[x[0]-x_mean, x[1]-x_mean...]
function construct_list (x:seq<real>, m:real) : (a:seq<real>)
  requires |x| >= 1
  ensures |x|==|a|
  {
    if |x| == 1 then [x[0]-m]
    else [x[0]-m] + construct_list(x[1..],m)
  }
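It may help to record what construct_list produces element-wise; a hedged sketch (lemma name mine) that Dafny should prove by induction on the sequence:

```dafny
lemma construct_list_spec(x: seq<real>, m: real)
  requires |x| >= 1
  decreases x
  ensures forall i :: 0 <= i < |x| ==> construct_list(x, m)[i] == x[i] - m
{
  if |x| > 1 {
    // the tail of the constructed list is the constructed tail
    construct_list_spec(x[1..], m);
  }
}
```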

//covariance is calculated in terms of product
function cov(x: seq<real>, y: seq<real>) : real
  requires |x| == |y|
  requires |x| >= 1
  ensures cov(x,y) == product(construct_list(x, mean_fun(x)),construct_list(y, mean_fun(y))) / (|x| as real) //if we do '(|x| as real)-1', then we can have a division by zero
  //i.e., ensures cov(x,y) == product(a,b) / (|x| as real)
{
  //if |x| == 1 then 0.0 //the base case would be at '|x|==1' instead of '|x|==0', because we have the precondition '|x| >= 1'
  //else 
  var x_mean := mean_fun(x);
  var y_mean := mean_fun(y);

  var a := construct_list(x, x_mean);
  var b := construct_list(y, y_mean);

  product(a,b) / (|a| as real)
}

//Now, test that Cauchy holds for Cov
lemma cauchyInequality_usingCov (x: seq<real>, y: seq<real>)
   requires |x| == |y|
   requires |x| >= 1
   requires forall i :: 0 <= i < |x| ==> x[i] >= 0.0 && y[i] >= 0.0
   ensures square(cov(x,y)) <= cov(x,x)*cov(y,y)
{
   CauchyInequality(x,y);
}
// IT DOES NOT HOLD!

Proving HelperLemma (not provable?)

While trying to prove HelperLemma, I found that Dafny is able to prove that the LHS of a*a*x + b*b*y >= 2.0*a*b*z is non-negative.

Then, I started to prove different cases (I know this is not the best procedure):

(1) If exactly one of a, b or z is negative (while the rest are positive), or when all three are negative, then the RHS is negative. And when the RHS is negative, the LHS is greater or equal (since the LHS is non-negative).

(2) If a or b or z is 0, then the RHS is 0, thus the LHS is greater or equal.

(3) If a>=1, b>=1 and z>=1, I tried to prove this using sqrt_ineq(). BUT! We can find a COUNTEREXAMPLE: with a=1, b=1, z=1, the RHS is 2; now choose x to be 0 and y to be 0 (i.e., LHS=0). Since 0 < 2, this is a counterexample.

Am I right? If not, how can I prove this lemma?


lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
  requires x >= 0.0 && y >= 0.0 //&& z >= 0.0
  requires square(z) <= x * y
  ensures a*a*x + b*b*y >= 2.0*a*b*z
  {
    assert square(z) == z*z;
    assert a*a*x + b*b*y >= 0.0; //a*a*x + b*b*y is always non-negative

    if ((a<0.0 && b>=0.0 && z>=0.0) || (a>=0.0 && b<0.0 && z>=0.0) || (a>=0.0 && b>=0.0 && z<0.0) || (a<0.0 && b<0.0 && z<0.0)) {
      assert 2.0*a*b*z <= 0.0; //2.0*a*b*z is negative or 0 when the product a*b*z is negative
      assert a*a*x + b*b*y >= 2.0*a*b*z; //when 2.0*a*b*z is negative or 0, the non-negative LHS is necessarily greater or equal
    }

    if (a==0.0 || b==0.0 || z==0.0){
      assert a*a*x + b*b*y >= 2.0*a*b*z;
    }

    else if (a>=1.0 && b>=1.0 && z>= 1.0){
      assert a*a >= a;
      assert b*b >= b;

      assert a*a+b*b >= a*b;

      assert square(z) <= x * y;
      //sqrt_ineq(square(z),x*y);
      //assert square(z) == z*z;
      //sqrt_square(z);
      //assert sqrt(z) <= sqrt(x*y);
      //assert (x==0.0 || y == 0.0) ==> (x*y==0.0) ==> (square(z) <= 0.0) ==> z<=0.0;
      assert a*a*x + b*b*y >= 2.0*a*b*z;
      //INDEED: COUNTER EXAMPLE: a=1, b=1, z=1. Thus: RHS is 2; now, choose x to be 0 and y to be 0 (i.e., LHS=0). Since 0<2, this is a counter example.
    }
  }

Theo Deep

1 Answer


Edited after clarification/fixing of the post-condition.

Here is an attempt to do it in Dafny using calculation and induction. A few comments on the verification:

Since the task at hand is verification, I assumed a sqrt function and some of its properties. HelperLemma is provable provided we assume some properties of the sqrt function. There are two assumptions in the verification that are easy to prove.

function square(r: real) : real
 ensures square(r) >=  0.0
{
  r * r
}

function sqrt(r: real) : real

function {:fuel 2} product(a: seq<real>, b: seq<real>) : real
  requires |a| == |b|
{
  if |a| == 0 then 0.0
  else a[0] * b[0] + product(a[1..], b[1..])
}

lemma sqrt_ineq(a: real, b: real)
  requires a >= 0.0 && b >= 0.0
  requires a <= b
  ensures sqrt(a) <= sqrt(b)

lemma sqrt_square(a: real)
  ensures sqrt(square(a)) == a

lemma CauchyBaseInequality(a1: real, a2: real, b1: real, b2: real)
  ensures square(a1*b1 + a2*b2) <= (a1*a1 + a2*a2) * (b1*b1 + b2*b2)
{
  calc <= {
    square(a1*b1 + a2*b2) - (a1*a1 + a2*a2) * (b1*b1 + b2*b2);
    a1*b1*a1*b1 + a2*b2*a2*b2 + 2.0*a1*b1*a2*b2 - a1*a1*b1*b1 - a1*a1*b2*b2 - a2*a2*b1*b1 - a2*a2*b2*b2;
    2.0*a1*b1*a2*b2 - a1*a1*b2*b2 - a2*a2*b1*b1;
    - square(a1*b2 - a2*b1);
    0.0;
  }
}

lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
  requires x >= 0.0 && y >= 0.0
  requires square(z) <= x * y
  ensures a*a*x + b*b*y >= 2.0*a*b*z

lemma CauchyInequality(a: seq<real>, b: seq<real>)
  requires |a| == |b|
  ensures square(product(a, b)) <= product(a, a) * product(b, b)
{
  if |a| <= 2 {
     if |a| == 2 {
       CauchyBaseInequality(a[0], a[1], b[0], b[1]);
     }
  }
  else {
    var x, xs := a[0], a[1..];
    var y, ys := b[0], b[1..];
    calc >= {
      product(a, a) * product(b, b) - square((product(a, b)));
      (x*x + product(xs, xs))*(y*y + product(ys, ys)) - square(x*y + product(xs, ys));
      x*x*y*y + y*y*product(xs, xs) + x*x*product(ys, ys) + product(xs, xs)*product(ys, ys)
        - x*x*y*y - square(product(xs, ys)) - 2.0*x*y*product(xs, ys);
      y*y*product(xs, xs) + x*x*product(ys, ys) + product(xs, xs) * product(ys, ys)
        - square(product(xs, ys)) - 2.0*x*y*product(xs, ys);
      { CauchyInequality(xs, ys); }
      y*y*product(xs, xs) + x*x*product(ys, ys) - 2.0*x*y*product(xs, ys);
      {
        CauchyInequality(xs, ys);
        assume product(xs, xs) >= 0.0;
        assume product(ys, ys) >= 0.0;
        HelperLemma(y, product(xs, xs), x, product(ys, ys), product(xs, ys));
      }
      0.0;
    }
  }
}
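The two assume statements above are the "two assumptions that are easy to prove" mentioned at the top of the answer: a product of a sequence with itself is a sum of squares. A minimal induction that should discharge them (lemma name mine):

```dafny
lemma product_self_nonneg(a: seq<real>)
  decreases a
  ensures product(a, a) >= 0.0
{
  if |a| > 0 {
    // a[0]*a[0] is a square; the tail is non-negative by induction
    product_self_nonneg(a[1..]);
  }
}
```

Calling product_self_nonneg(xs) and product_self_nonneg(ys) would then replace the assumes.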

lemma cauchyInequality_usingCov (x: seq<real>, y: seq<real>)
   requires |x| == |y|
   requires |x| >= 1
   requires forall i :: 0 <= i < |x| ==> x[i] >= 0.0 && y[i] >= 0.0
   ensures square(cov(x,y)) <= cov(x,x)*cov(y,y)
{
   var x_mean := mean_fun(x);
   var y_mean := mean_fun(y);
    
   var a := construct_list(x, x_mean);
   var b := construct_list(y, y_mean);
   assert cov(x, y) == product(a, b) / (|x| as real);
   assert cov(x, x) == product(a, a) / (|x| as real);
   assert cov(y, y) == product(b, b) / (|x| as real);
   CauchyInequality(a, b);
    
 }
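The reason CauchyInequality suffices here is that cov just divides product by the same positive (|x| as real) on both sides, and dividing both sides of an inequality by the same positive real preserves <=. If Dafny ever needs that step spelled out, it can be isolated in a tiny lemma (name mine) that should verify with an empty body:

```dafny
lemma div_preserves_ineq(p: real, q: real, n: real)
  requires n > 0.0
  requires p <= q
  ensures p / n <= q / n
{
}
```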

Verification of Helper Lemma

function square(r: real): real
  ensures square(r) >= 0.0
{
   r * r
}

function abs(a: real) : real
  ensures abs(a) >= 0.0
{
  if a < 0.0 then -a else a
}

function sqrt(r: real): real
requires r >= 0.0
ensures square(sqrt(r)) == r  && sqrt(r) >= 0.0
  
lemma sqrtIneq(a: real, b: real)
  requires a >= 0.0 && b >= 0.0
  requires a <= b
  ensures sqrt(a) <= sqrt(b)

lemma sqrtLemma(a: real)
   ensures sqrt(square(a)) == abs(a)

lemma sqrtMultiply(a: real, b: real)
  requires a >= 0.0 && b >= 0.0
  ensures sqrt(a*b) == sqrt(a) * sqrt(b);

lemma differenceLemma(a: real, b: real)
   requires a >= abs(b) 
   ensures a - b >= 0.0 && a + b >= 0.0
{
}

lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
  requires x >= 0.0 && y >= 0.0
  requires square(z) <= x * y
  ensures a*a*x + b*b*y >= 2.0*a*b*z
{
   
   var sx := sqrt(x);
   var sy := sqrt(y);

   assert sx * sx == x;
   assert sy * sy == y;
   if (a >= 0.0 && b >= 0.0) || (a < 0.0 && b < 0.0) {
    calc >= {
      a*a*x + b*b*y - 2.0*a*b*z;
      a*a*sx*sx + b*b*sy*sy - 2.0*a*b*sx*sy + 2.0*a*b*sx*sy - 2.0*a*b*z;
      square(a*sx - b*sy) + 2.0*a*b*sx*sy - 2.0*a*b*z;
      2.0*a*b*sx*sy - 2.0*a*b*z;
      2.0*a*b*(sx*sy - z);
      {
        sqrtIneq(square(z), x * y);
        assert sqrt(x * y) >= sqrt(square(z));
        sqrtMultiply(x, y);
        assert sx * sy >= sqrt(square(z));
        sqrtLemma(z);
        assert sx * sy >= abs(z);
        differenceLemma(sx*sy, z);
        assert sx*sy >= z;
      }
      0.0;
    }
   }
   else {
      calc >= {
         a*a*x + b*b*y - 2.0*a*b*z;
         a*a*sx*sx + b*b*sy*sy + 2.0*a*b*sx*sy - 2.0*a*b*sx*sy - 2.0*a*b*z;
         square(a*sx + b*sy) - 2.0*a*b*sx*sy - 2.0*a*b*z;
         - 2.0*a*b*sx*sy - 2.0*a*b*z;
         - 2.0*a*b*(sx*sy + z);
         {
           sqrtIneq(square(z), x * y);
           assert sqrt(x * y) >= sqrt(square(z));
           sqrtMultiply(x, y);
           assert sx * sy >= sqrt(square(z));
           sqrtLemma(z);
           assert sx * sy >= abs(z);
           differenceLemma(sx*sy, z);
           assert sx*sy >= -z;
         }
         0.0;
       }
   }
}
Divyanshu Ranjan
  • Hm.. how is that possible? I must be missing some conditions, as you are saying. That should hold, according to the definition of https://en.wikipedia.org/wiki/Cauchy%E2%80%93Schwarz_inequality in 'R n-dimensional space'.. – Theo Deep Oct 21 '22 at 19:03
  • Well, now that I look again at this link, it seems that it should be `pow(covariance_fun (x,y),2) <= covariance_fun(x,x) * covariance_fun (y,y)`, do you see the same? Note that the `pow()` part is `covariance_fun (x,y)*covariance_fun (x,y)`. Thanks for the answer! – Theo Deep Oct 21 '22 at 19:04
  • Yes, I tested ensures `(covariance_fun (x,y)*covariance_fun (x,y)) <= covariance_fun(x,x) * covariance_fun (y,y);` and does not hold either. – Theo Deep Oct 21 '22 at 19:08
  • This looks right. I have modified my answer to show an initial attempt at proving it. But I fear Dafny/Z3 is not well suited for verifying non-linear arithmetic over real numbers. – Divyanshu Ranjan Oct 22 '22 at 11:24
  • Hi, thanks a lot for the response! However, I am trying to follow it and there is lots of code missing :( (1) I guess that method `square` receives `a` and returns `a*a`; (2) I guess that `product` is my `covariance_fun`; (3) I guess `sqrt` calculates the square root, but I do not know how to implement this (I only find an example for naturals). And besides that, do you mean that if we `assume` those `assert` of the calculation (i.e., we prove those helper lemmas such as `sqrtLemma`) the property will hold? How do you know that? I need more info! :) Thanks again!! – Theo Deep Oct 22 '22 at 22:32
  • Thanks again! I am actively researching this code, since I cannot understand it completely. Will come back. The big problem I am encountering is implementation: i.e., we can surely prove this symbolically (i.e., with a symbolic definition of the properties of the square root function), but actual implementation of actual square root needs approximation methods. I will be asking more about this and we will be able to reach a response. Thanks a lot, these contributions help a lot. – Theo Deep Oct 24 '22 at 16:04
  • Yes, you are right. A sqrt implementation will use approximation methods. Since the whole point of this exercise is to prove certain properties, not to run the program, it seems appropriate to assume the existence of sqrt with some properties. – Divyanshu Ranjan Oct 24 '22 at 17:34
  • Totally agree. However, what I understand is that the `assume` and the `HelperLemma` are provable using Dafny, right? – Theo Deep Oct 24 '22 at 17:39
  • That's it. By the way: there are provers that handle these kinds of problems (sine, sqrt...) up to a certain point: e.g., MetiTarski and dReal. Thanks again! – Theo Deep Oct 24 '22 at 17:44
  • HelperLemma is provable assuming sqrt_ineq and that sqrt exists for every positive real number – Divyanshu Ranjan Oct 24 '22 at 17:44
  • I am reading the code again and I think I am not understanding your solution. For instance, the 'covariance' (I think it is 'product') does not use any 'mean' calculation. And why does it have a fuel? It should unroll the formula at least until the length of lists, should it not? Help :( – Theo Deep Oct 24 '22 at 21:10
  • As per your observation, the original inequality is isomorphic to the Cauchy inequality. To see it, do this transformation: a = x - <x_mean> and b = y - <y_mean>, where <c> represents a seq with the repeated element c. Now the first two assumes still hold but not the third, but that one is not required for the proof. Fuel is required for the case |a| == 2, where we need product(a, b) == a[0]*b[0] + a[1]*b[1]. No, it will not unroll up to the length but according to the definition, i.e. product(a, b) == a[0]*b[0] + product(a[1..], b[1..]); further unrolling of product(a[1..], b[1..]) will not happen due to insufficient fuel – Divyanshu Ranjan Oct 25 '22 at 05:59
  • I think I do not understand, sorry :( My question is simple: is `product` the `covariance` function? I tested it for the variance case (i.e., `product(x,x)`) and is not. An example is the input `a=[2,2]`: `covariance` would return `0` (the correct answer), whereas `product` would return `8`. `Covariance(a,a)` is `summation(a,a)/(|a|-1)`, where `summation(a,a)` does `|a|=0 then 0.0` (as you do) and `else then (a[0]-mean(a))^2+summation(a[1..])` (this is the different part). – Theo Deep Oct 25 '22 at 11:47
  • I know I am the one missing something, since in Coq they proved Cauchy-Schwarz also using your definition of `product` (see second answer in https://stackoverflow.com/questions/67033055/cauchy-schwartz-inequality-in-coq). But I need to use typical covariance definition (i.e., the one with summation) so that I can use it in other executable code. Let us see if I catch what I am missing.. – Theo Deep Oct 25 '22 at 11:47
  • `product` is not same as `covariance` but `covariance` can be expressed in terms of `product`. `covariance(x, x) = product(a, a) / |a|-1` where `a = [x[0] - x_mean, x[1] - x_mean, ...]`. Now whatever you set out to prove originally, can be proved using CauchyInequality with covariance expressed as product. – Divyanshu Ranjan Oct 25 '22 at 11:59
  • I see you can express covariance in terms of product. However, when you prove `ensures square(product(a, b)) <= product(a, a) * product(b, b)` in `CauchyInequality` it does not seem to me that you are proving the inequality I defined, which is: `square(covariance(a, b)) <= covariance(a, a) * covariance(b, b)`. Since it seems that I am not understanding this, I will make the following proof: I will define `covariance` in terms of `product` and see whether lemma `square(covariance(a, b)) <= covariance(a, a) * covariance(b, b)` holds using your `CauchyInequality` in the body. Does this sound ok? – Theo Deep Oct 25 '22 at 12:27
  • Yes, it sounds ok. And it seems to be provable using CauchyInequality – Divyanshu Ranjan Oct 25 '22 at 12:31
  • Hi, I did this (see 'Elaborating on the answer' in the post) and it could not verify. Any idea why? – Theo Deep Oct 25 '22 at 13:26
  • I added how to verify the cov inequality and removed some preconditions on older lemmas which were not required. I would say: try to chart out the proof with paper and pencil before doing verification using tools. It helps; for example, in your attempt you were passing the wrong arguments. – Divyanshu Ranjan Oct 25 '22 at 15:05
  • I can't believe I've gone so long without understanding it: checking `square(product(a, b)) <= product(a, a) * product(b, b)` means checking `square(cov(x,y)) <= cov(x,x)*cov(y,y)`, because the latter is simply the former, but with both sides of the inequality divided by `n^2`. My goodness, I just misunderstood the definition of `product` over and over again. Thank you very much for your patience and explanations. It's all clear and verified. – Theo Deep Oct 25 '22 at 15:39
  • I understand that Dafny has a built-in fact that says something like `a<=b` implies `(a/n)<=(b/n)` for positive `n`, because it is able to verify `cauchyInequality_usingCov()` only by calling `CauchyInequality()`. Thanks again!! – Theo Deep Oct 25 '22 at 15:39
  • Hi, sorry for asking again, but I think there is an `assume` that cannot be proven. In `CauchyInequality` we encounter three assumes: `assume product(xs, xs) >= 0.0`, `assume product(ys, ys) >= 0.0` and `assume product(xs, ys) >= 0.0`. The first two are provable, since they multiply with themselves (thus they are always positive): indeed, this is the reason why `variance(x)` or `covariance(x,x)` is always positive. However, the last seems not true, `product(xs, ys) >= 0.0` can be negative (like `covariance(x,y)` with `x!=y` can be negative). Am I right? – Theo Deep Oct 27 '22 at 22:31
  • Yes, but that assume is not required, as HelperLemma doesn't have a precondition that z >= 0.0. Let me remove it – Divyanshu Ranjan Oct 28 '22 at 05:15
  • I see! Removed it too, thanks again. Of course, it could be the case that we do not accept negative covariance(a,b), but this would be stated with a symbolic ensures that cannot be proven... or we would have to add an absolute value. Thanks again! – Theo Deep Oct 28 '22 at 07:56
  • Hi again, I think HelperLemma cannot be proven; I provided a counterexample (see the addition of "Proving HelperLemma" in the post). – Theo Deep Oct 28 '22 at 12:32
  • I have added the verification of HelperLemma. One thing I noticed when verifying is that I needed a >= 0.0 and b >= 0.0 as preconditions of HelperLemma. But that is already provided, as all elements of both seqs are positive. I have modified part of the code to reflect that. Regarding your counterexample: you can't choose x and y to be 0, since square(z) <= x * y, so x * y must be at least 1. – Divyanshu Ranjan Oct 28 '22 at 14:05
  • Hi, thanks a lot for the proof. As for "One thing I noticed when verifying that I needed a >= 0.0 and b >= 0.0 as precondition of HelperLemma", I think this is not so. I mean, given an original sequence s, a[0] is (s[0]-a_mean), which can be negative, even if all the elements of s are positive. For instance, given s=[2,4], the mean is 3 and a[0]=(2-3), which is negative. – Theo Deep Oct 28 '22 at 15:13
  • You are right. It is possible to prove this when a < 0 or b < 0. I have updated my answer. Cheers – Divyanshu Ranjan Oct 28 '22 at 15:39
  • I was about to say that: eliminating that requires yields no problems. Cheers back! – Theo Deep Oct 28 '22 at 15:43