I am trying to prove the following property in Dafny: covariance(x,y) <= variance(x) * variance(y), where x and y are sequences of reals (seq<real>).
First of all, I would like to know whether I am interpreting covariance and variance correctly, so my first question is: is proving covariance(x,y) <= variance(x) * variance(y) the same as proving covariance(x,y) <= covariance(x,x) * covariance(y,y)? I ask because the variance usually appears squared (i.e., sigma^2(x)), which confuses me.
However, I understand from https://en.wikipedia.org/wiki/Covariance that this equivalence holds. Concretely, it states: "The variance is a special case of the covariance in which the two variables are identical." I also take one of the definitions of covariance from that link, namely: cov(x,y) = (1/n) * summation(i from 1 to n){(x_i - mean(x)) * (y_i - mean(y))}.
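As a quick sanity check of that quoted claim (this is just my own rewriting of the 1/n definition above), setting y = x gives exactly the usual population variance:

cov(x,x) = (1/n) * summation(i from 1 to n){(x_i - mean(x)) * (x_i - mean(x))}
         = (1/n) * summation(i from 1 to n){(x_i - mean(x))^2}
         = variance(x)

so replacing variance(x) by covariance(x,x) seems justified, at least for this 1/n version of the definition.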
Thus, assuming that I must prove covariance(x,y) <= covariance(x,x) * covariance(y,y) and that covariance is defined as above, I try the following lemma in Dafny:
lemma covariance_equality (x: seq<real>, y:seq<real>)
  requires |x| > 1 && |y| > 1; //my conditions
  requires |x| == |y|; //my conditions
  requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
  requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
  ensures covariance_fun(x,y) <= covariance_fun(x,x) * covariance_fun(y,y);
{}
This cannot be proven automatically, and the point is that I do not know how to prove it manually. I made this (not useful) attempt at a proof by contradiction, but I guess this is not the way:
lemma covariance_equality (x: seq<real>, y:seq<real>)
  requires |x| > 1 && |y| > 1; //my conditions
  requires |x| == |y|; //my conditions
  requires forall elemX :: elemX in x ==> 0.0 < elemX; //my conditions
  requires forall elemY :: elemY in y ==> 0.0 < elemY; //my conditions
  ensures covariance_fun(x,y) <= covariance_fun(x,x) * covariance_fun(y,y);
{
  if !(covariance_fun(x,y) <= covariance_fun(x,x) * covariance_fun(y,y)) {
    //...
    assert !((forall elemX' :: elemX' in x ==> 0.0 < elemX') && (forall elemY' :: elemY' in y ==> 0.0 < elemY')); //If we assume this, the proof is verified
    assert false;
  }
}
My main question is: how can I prove this lemma?
So that you have all the information, here are the functions that make up the covariance calculation. First, the covariance function is as follows:
function method covariance_fun(x:seq<real>, y:seq<real>):real
  requires |x| > 1 && |y| > 1;
  requires |x| == |y|;
  decreases |x|, |y|;
{
  summation(x, mean_fun(x), y, mean_fun(y)) / ((|x|-1) as real)
}
As you can see, it performs the summation and then divides by the length minus one, i.e., (|x|-1) (note that this is the sample covariance, whereas the formula above divides by n). Also, note that the summation function receives the mean of x and the mean of y. summation is as follows:
function method summation(x:seq<real>, x_mean:real, y:seq<real>, y_mean:real):real
  requires |x| == |y|;
  decreases |x|, |y|;
{
  if x == [] then 0.0
  else ((x[0] - x_mean)*(y[0] - y_mean))
       + summation(x[1..], x_mean, y[1..], y_mean)
}
As you can see, it recursively computes (x_i - mean(x)) * (y_i - mean(y)) for each position.
As for helper functions, mean_fun calculates the mean and is as follows:
function method mean_fun(s:seq<real>):real
  requires |s| >= 1;
  decreases |s|;
{
  sum_seq(s) / (|s| as real)
}
Note that it uses the sum_seq function which, given a sequence, calculates the sum of all its positions. It is defined recursively:
function method sum_seq(s:seq<real>):real
  decreases s;
{
  if s == [] then 0.0 else s[0] + sum_seq(s[1..])
}
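For example, sum_seq([1.0, 2.0, 3.0]) should be 6.0, and therefore mean_fun([1.0, 2.0, 3.0]) should be 2.0. Again, a small illustrative check (the slice asserts are only hints for the verifier):

lemma mean_fun_example()
  ensures sum_seq([1.0, 2.0, 3.0]) == 6.0
  ensures mean_fun([1.0, 2.0, 3.0]) == 2.0
{
  assert [1.0, 2.0, 3.0][1..] == [2.0, 3.0];
  assert [2.0, 3.0][1..] == [3.0];
  assert [3.0][1..] == [];
}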
Now, with all these ingredients, can anyone tell me (1) if I already did something wrong and/or (2) how to prove the lemma?
PS: I have been looking around and this lemma seems to be the Cauchy–Schwarz inequality, but I still do not know how to prove it.
Elaborating on the answer
I will (1) define a covariance function cov in terms of product and then (2) use this definition in a new lemma that should hold. Also, note that I add the restriction that 'x' and 'y' are non-empty, but you can modify this.
//calculates the mean of a sequence
function method mean_fun(s:seq<real>):real
  requires |s| >= 1;
  decreases |s|;
{
  sum_seq(s) / (|s| as real)
}
//from x and the mean m it constructs a = [x[0]-m, x[1]-m, ...]
function construct_list (x:seq<real>, m:real) : (a:seq<real>)
  requires |x| >= 1
  ensures |x| == |a|
{
  if |x| == 1 then [x[0]-m]
  else [x[0]-m] + construct_list(x[1..], m)
}
//covariance is calculated in terms of product
function cov(x: seq<real>, y: seq<real>) : real
  requires |x| == |y|
  requires |x| >= 1
  ensures cov(x,y) == product(construct_list(x, mean_fun(x)), construct_list(y, mean_fun(y))) / (|x| as real) //if we divided by '(|x| as real)-1', then we could have a division by zero
  //i.e., ensures cov(x,y) == product(a,b) / (|x| as real)
{
  //if |a| == 1 then 0.0 //we do the base case in '|a|==1' instead of '|a|==0', because we have the precondition '|a| >= 1'
  //else
  var x_mean := mean_fun(x);
  var y_mean := mean_fun(y);
  var a := construct_list(x, x_mean);
  var b := construct_list(y, y_mean);
  product(a,b) / (|a| as real)
}
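For completeness: product, square, and CauchyInequality come from the answer I am elaborating on and are not repeated here. I assume they look roughly like this (a dot product over sequences of equal length, squaring of a real, and the Cauchy–Schwarz statement over product, respectively):

//my assumption of what 'product' looks like: the sum of element-wise products
function product(a: seq<real>, b: seq<real>): real
  requires |a| == |b|
  decreases |a|
{
  if a == [] then 0.0 else a[0]*b[0] + product(a[1..], b[1..])
}

function square(r: real): real
{
  r * r
}

//statement as I understand it from the answer; the body/proof is in that answer
lemma CauchyInequality(a: seq<real>, b: seq<real>)
  requires |a| == |b|
  ensures square(product(a, b)) <= product(a, a) * product(b, b)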
//Now, test that Cauchy holds for cov
lemma cauchyInequality_usingCov (x: seq<real>, y: seq<real>)
  requires |x| == |y|
  requires |x| >= 1
  requires forall i :: 0 <= i < |x| ==> x[i] >= 0.0 && y[i] >= 0.0
  ensures square(cov(x,y)) <= cov(x,x)*cov(y,y)
{
  CauchyInequality(x,y);
}
//IT DOES NOT HOLD!
Proving HelperLemma (not provable?)
Trying to prove HelperLemma, I find that Dafny is able to prove that the LHS of a*a*x + b*b*y >= 2.0*a*b*z is non-negative.
Then, I started to prove different cases (I know this is not the best procedure):
(1) If one of a, b, or z is negative (while the rest are positive), or when all three are negative, then the RHS is negative. And when the RHS is negative, the LHS is greater or equal (since the LHS is non-negative).
(2) If a, b, or z is 0, then the RHS is 0, and thus the LHS is greater or equal.
(3) If a >= 1, b >= 1 and z >= 1, I tried to prove this using sqrt_ineq().
BUT! We can find a COUNTEREXAMPLE. With a = 1, b = 1, z = 1, the RHS is 2; now, choose x to be 0 and y to be 0 (i.e., LHS = 0). Since 0 < 2, this is a counterexample.
Am I right? If not, how can I prove this lemma?
lemma HelperLemma(a: real, x: real, b: real, y: real, z: real)
  requires x >= 0.0 && y >= 0.0 //&& z >= 0.0
  requires square(z) <= x * y
  ensures a*a*x + b*b*y >= 2.0*a*b*z
{
  assert square(z) == z*z;
  assert a*a*x + b*b*y >= 0.0; //a*a*x + b*b*y is always non-negative
  if ((a<0.0 && b>=0.0 && z>=0.0) || (a>=0.0 && b<0.0 && z>=0.0) || (a>=0.0 && b>=0.0 && z<0.0) || (a<0.0 && b<0.0 && z<0.0)) {
    assert 2.0*a*b*z <= 0.0; //2.0*a*b*z is negative or 0 when the product a*b*z is negative
    assert a*a*x + b*b*y >= 2.0*a*b*z; //when 2.0*a*b*z is negative or 0, a*a*x + b*b*y is necessarily greater or equal
  }
  if (a==0.0 || b==0.0 || z==0.0) {
    assert a*a*x + b*b*y >= 2.0*a*b*z;
  }
  else if (a>=1.0 && b>=1.0 && z>=1.0) {
    assert a*a >= a;
    assert b*b >= b;
    assert a*a+b*b >= a*b;
    assert square(z) <= x * y;
    //sqrt_ineq(square(z),x*y);
    //assert square(z) == z*z;
    //sqrt_square(z);
    //assert sqrt(z) <= sqrt(x*y);
    //assert (x==0.0 || y == 0.0) ==> (x*y==0.0) ==> (square(z) <= 0.0) ==> z<=0.0;
    assert a*a*x + b*b*y >= 2.0*a*b*z;
    //INDEED: COUNTER EXAMPLE: a=1, b=1, z=1. Thus: RHS is 2; now, choose x to be 0 and y to be 0 (i.e., LHS=0). Since 0<2, this is a counter example.
  }
}