I have a 2D data array and I'm trying to get a profile of values about its center in an efficient manner. So the output should be two one-dimensional arrays: one with the values of distances from the center, the other with the mean of all the values in the original 2D that are at that distance from the center.
Each index has a non-integer distance from the center, which prevents me from using some already known solutions for the problem. Allow me to explain.
Consider these matrices
data = np.random.randn(5,5)
L = 2
x = np.arange(-L,L+1,1)*2.5
y = np.arange(-L,L+1,1)*2.5
xx, yy = np.meshgrid(x, y)
r = np.sqrt(xx**2. + yy**2.)
So the matrices are
In [30]: r
Out[30]:
array([[ 7.07106781, 5.59016994, 5. , 5.59016994, 7.07106781],
[ 5.59016994, 3.53553391, 2.5 , 3.53553391, 5.59016994],
[ 5. , 2.5 , 0. , 2.5 , 5. ],
[ 5.59016994, 3.53553391, 2.5 , 3.53553391, 5.59016994],
[ 7.07106781, 5.59016994, 5. , 5.59016994, 7.07106781]])
In [31]: data
Out[31]:
array([[ 1.27603322, 1.33635284, 1.93093228, 0.76229675, -0.00956535],
[ 0.69556071, -1.70829753, 1.19615919, -1.32868665, 0.29679494],
[ 0.13097791, -1.33302719, 1.48226442, -0.76672223, -1.01836614],
[ 0.51334771, -0.83863115, -0.41541794, 0.34743342, 0.1199237 ],
[-1.02042539, 0.90739383, -2.4858624 , -0.07417987, 0.90748933]])
For this case the expected output should be array([ 0. , 2.5 , 3.53553391, 5. , 5.59016994, 7.07106781])
for the index of distances, and a second array of same length with the mean of all the values that are at those corresponding distances: array([ 0.98791323, -0.32496927, 0.37221219, -0.6209728 , 0.27986926, 0.04060628])
.
From this answer there is a very nice function to compute the profile about any arbitrary point. However, the problem with his approach is that it approximates the distance r
by the index distance. So his r
for my case would be this:
array([[2, 2, 2, 2, 2],
[2, 1, 1, 1, 2],
[2, 1, 0, 1, 2],
[2, 1, 1, 1, 2],
[2, 2, 2, 2, 2]])
which is a pretty big difference for me, since I'm working with small matrices. This approximation, however, allows him to use np.bincount
, which is pretty handy (but won't work for me).
I've been trying to expand this for float distance, like my version r
, but so far no luck. bincount
doesn't work with floats and histogram
needs equally-spaced bins, which is not the case. Any suggestion?