I'm using MySQL & PHP to calculate Pearson's R measuring the relationship over a number of years between political donations and tenders received by a particular business.
I've run into trouble with the MySQL query that is feeding values into the algorithm. The algorithm itself is fine and evaluates correctly. The problem is in the query used to get the data.
The formula I'm using for Pearson's R is at http://www.statisticshowto.com/how-to-compute-pearsons-correlation-coefficients/
Here is the basic MySQL query that spits out values for each year:
SELECT count( distinct year) as count,name,sum(donations), sum(tenders), sum(donations * tenders) as xy,(sum(donations)*sum(donations)) as x2, (sum(tenders)*sum(tenders)) as y2 from money_by_year where name='$name' group by name,year
Here is the query WITH ROLLUP to get only the final values:
SELECT count( distinct year) as count,name,sum(donations), sum(tenders), sum(donations * tenders) as xy,(sum(donations)*sum(donations)) as x2, (sum(tenders)*sum(tenders)) as y2 from money_by_year where name='$name' group by name with rollup LIMIT 1
The problem is that the totals from the second query are wrong in the sum xy, x2 & y2. This is being caused by the query itself, probably the ROLLUP and I'd like to know what is going on with it.
You can see working examples of the code with the values resulting from both the above queries and the algorithm at https://openaus.net.au/follow_the_money.php?name=KPMG
I have tried various changes to sum(donations * tenders) as xy for example implementing it as sum(donations) * sum(tenders) as in:
SELECT count( distinct year) as count,name,sum(donations), sum(tenders), sum(donations) * sum(tenders) as xy,(sum(donations)*sum(donations)) as x2, (sum(tenders)*sum(tenders)) as y2 from money_by_year where name='KPMG' group by name with rollup LIMIT 1
however the ROLLUP totals are incorrect, much bigger than they should be. The values I want may not be possible via a single MySQL query however I would appreciate knowing why this is the case, what the ROLLUP is doing to the figures and why.