The problem
I have two arrays, which we'll call ar1 and ar2 (each of size (192, 289)), that represent lat-lon maps of standard deviations, and I have a similarly-sized array of their difference. I want to plot the difference and, on top of it, a stippling pattern wherever the difference between the two arrays is statistically significant at the 95% confidence level (alpha = 0.05).
The code
I based my code on this example:
How do I do a F-test in python
I used Joel Cornett's solution, substituting ar1 and ar2 in for X and Y.
F = np.var(ar1) / np.var(ar2)
print np.var(ar1), np.var(ar2)
print F
0.118586507371 0.161485609461
0.734347213766
For the next part, I want N-2 degrees of freedom for my analysis, where N is the number of points in each array, in this case 55488 (192 x 289). len(ar1) and len(ar2) won't work here since they only give the length of the first dimension, so I tried flattening the arrays to get the correct length.
df1 = len(ar1.flatten()) - 2
df2 = len(ar2.flatten()) - 2
print df1, df2
55486 55486
However, going forward with this I ended up with a p-value of 9.88365269356e-289 (essentially 0). This is a single value and, as I expected in this particular case, statistically insignificant, but I need an array of values in order to do the stippling so I can see if there's any place on the grid where the difference IS significant. I'm just not sure how to perform this test on a 2-D array since all the examples I'm finding use lists or other 1-D data types, and I also just have never done an analysis like this before. (I'm doing it at the request of my advisor, who doesn't use Python).
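For reference, here is a minimal sketch of how that single p-value can be reproduced. It assumes the p-value is the lower-tail probability of the F distribution obtained with `scipy.stats.f.cdf`, which is my understanding of the linked solution; the F value and degrees of freedom are the ones printed above.

```python
from scipy import stats

# Values taken from the output shown earlier in this post
F = 0.734347213766
n_points = 192 * 289        # 55488 grid points in each array
df1 = df2 = n_points - 2    # 55486, matching the printed degrees of freedom

# Assumption: the p-value is the lower-tail probability P(F_dist <= F)
p_value = stats.f.cdf(F, df1, df2)
print(p_value)
```

Note that this treats every grid point as one sample, so it yields one number for the whole map rather than a value per grid point.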
The Question
How do you perform an F-test on two 2-D arrays so that the result is a similarly-sized array containing a p-value for each grid point?
I can amend this to fill in anything I might be missing due to my lack of understanding of the subject (and let me know if the p-value I got doesn't seem right), but if this is too complex or incomplete to get help on, I'll just delete it.
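To make the desired output concrete, here is a rough sketch of the kind of result I am after. It rests on an assumption that is not stated above: that each standard deviation at a grid point was computed from a known number of independent samples (the hypothetical `n1` and `n2`, for example time steps). The function name `gridpoint_f_test` and the synthetic data are also made up for illustration, and this may not be the statistically appropriate test for my case.

```python
import numpy as np
from scipy import stats

def gridpoint_f_test(sd1, sd2, n1, n2):
    """Two-sided F-test for equal variances at every grid point.

    sd1, sd2 : 2-D arrays of sample standard deviations (ddof=1 assumed)
    n1, n2   : hypothetical number of samples behind each standard deviation
    """
    F = sd1 ** 2 / sd2 ** 2               # ratio of variances, element-wise
    df1, df2 = n1 - 1, n2 - 1
    lower = stats.f.cdf(F, df1, df2)      # P(F_dist <= F)
    upper = stats.f.sf(F, df1, df2)       # P(F_dist >= F)
    p = np.minimum(2.0 * np.minimum(lower, upper), 1.0)
    return F, p

# Synthetic stand-ins for the real data: 30 samples per grid point
rng = np.random.RandomState(0)
n1 = n2 = 30
ar1 = rng.normal(size=(n1, 192, 289)).std(axis=0, ddof=1)
ar2 = rng.normal(size=(n2, 192, 289)).std(axis=0, ddof=1)

F, p = gridpoint_f_test(ar1, ar2, n1, n2)
significant = p < 0.05                    # boolean mask, same shape as ar1
print(p.shape)
```

A boolean mask like `significant` is what I would then try to stipple, for instance through the `hatches` argument of matplotlib's `contourf`.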