This is only a Numpy solution, hope it suffices.
I assume that your function is made up of an operation on the matrix elements and of a mean, i.e. a scaled sum. Hence, it is sufficient to look at Y
as in
Y = np.power(X-X0, 2)
So we only need to deal with determining a windowed mean. Note that for the 1D case the matrix product with an appropriate vector of ones can be determined for calculating the mean, e.g.
h = np.array([0, 1, 1, 0]) # same dimension as y
m1 = np.dot(h, y) / 2
m2 = (y[1] + y[2]) / 2
print(m1 == m2) # True
The 2D case is analogous, but with two matrix multiplications, one for the rows and one for the columns. E.g.
m_2 = np.dot(np.dot(h, Y), h) / 2**2
To construct a sliding window, we need to build a matrix of shifted windows, e.g.
H = [[1, 1, 1, 0, 0, ..., 0],
[0, 1, 1, 1, 0, ..., 0],
.
.
.
[0, ..., 0, 0, 1, 1, 1]]
to calculate all the sums
S = np.dot(np.dot(H, Y), H.T)
A full example for a (n, n)
matrix with a (m, m)
window would be
import numpy as np
n, m = 500, 10
X0 = np.ones((n, n))
X = np.random.rand(n, n)
Y = np.power(X-X0, 2)
h = np.concatenate((np.ones(m), np.zeros(n-m))) # window at position 0
H = np.vstack((np.roll(h, k) for k in range(n+1-m))) # slide the window
M = np.dot(np.dot(H,Y), H.T) / m**2 # calculate the mean
print(M.shape) # (491, 491)
An alternative but probably slightly less efficient way for building H
is
H = np.sum(np.diag(np.ones(n-k), k)[:-m+1, :] for k in range(m))
Update
Calculating the mean squared deviation is also possible with that approach. For that, we generalize the vector identity |x-x0|^2 = (x-x0).T (x-x0) = x.T x - 2 x0.T x + x0.T x0
(a space denotes a scalar or matrix multiplication and .T
a transposed vector) to the matrix case:
We assume W
is a (m,n)
matrix containing a block (m.m)
identity matrix, which is able to extract the (k0,k1)
-th (m,m)
sub-matrix by Y = W Z W.T
, where Z
is the (n,n)
matrix containing the data. Calculating the difference
D = Y - X0 = Y = W Z W.T - X0
is straightforward, where X0
and D
is a (m,m)
matrix. The square-root of the squared sum of the elements is called Frobenius norm. Based on those identities, we can write the squared sum as
s = sum_{i,j} D_{i,j}^2 = trace(D.T D) = trace((W Z W.T - X0).T (H Z H.T - X0))
= trace(W Z.T W.T W Z W.T) - 2 trace(X0.T W Z W.T) + trace(X0.T X0)
=: Y0 + Y1 + Y2
The term Y0
can be interpreted as H Z H.T
from the method from above.The term Y1
can be interpreted as a weighted mean on Z
and Y2
is a constant, which only needs to be determined once.
Thus, a possible implementation would be:
import numpy as np
n, m = 500, 10
x0 = np.ones(m)
Z = np.random.rand(n, n)
Y0 = Z**2
h0 = np.concatenate((np.ones(m), np.zeros(n-m)))
H0 = np.vstack((np.roll(h0, k) for k in range(n+1-m)))
M0 = np.dot(np.dot(H0, Y0), H0.T)
h1 = np.concatenate((-2*x0, np.zeros(n-m)))
H1 = np.vstack((np.roll(h1, k) for k in range(n+1-m)))
M1 = np.dot(np.dot(H1, Z), H0.T)
Y2 = np.dot(x0, x0)
M = (M0 + M1) / m**2 + Y2