I need to compute the Theil-Sen regression slope and intercept between the values of two rasters in a GRASS GIS Python script. The two rasters in this example (xtile and ytile) have the same dimensions, 250x250 pixels, and contain nodata (null) values. So far I have used grass.script alone, so I am new to scipy. I read some tutorials and, based on those, came up with the following code, which I tried on the command line:
>>> from scipy import stats
>>> import grass.script as grass
>>> import grass.script.array as garray
>>> x = garray.array(mapname="xtile")
>>> y = garray.array(mapname="ytile")
>>> res_list = stats.theilslopes(y, x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/dist-packages/scipy/stats/_stats_mstats_common.py", line 214, in theilslopes
deltax = x[:, np.newaxis] - x
MemoryError
Evidently, it is not so simple. Edit: I removed my thoughts about the problem being the dimensionality of the array; I was wrong. Now it seems the 250x250 array size is simply too much. Is that so? Any idea how to get around this?
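From the traceback, the line that fails is `deltax = x[:, np.newaxis] - x`, so if I read the scipy source right, theilslopes flattens its input and then materialises the full pairwise difference matrix. A back-of-the-envelope estimate of what that means for my raster (the 8 bytes per float64 is my assumption about the dtype):

```python
# Rough memory estimate for scipy.stats.theilslopes on a 250x250 raster.
# theilslopes flattens the input to n points; the failing line
# (deltax = x[:, np.newaxis] - x) then builds an n x n float matrix.
n = 250 * 250                      # 62500 data points after flattening
bytes_needed = n * n * 8           # pairwise difference matrix, float64
gib = bytes_needed / 2.0 ** 30
print(n, round(gib, 1))            # 62500 points -> roughly 29.1 GiB
```

So it is not the 2-D shape as such but the number of points: the quadratic memory cost of the pairwise matrix is far beyond my RAM, even before nodata pixels are dropped.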
Then there seems to be another problem. When I try to print the array x,
>>> print x
[[ 0. 0. 0. ... 0. 0. 0.]
[ 0. 0. 0. ... 0. 0. 0.]
[ 0. 402. 0. ... 0. 0. 0.]
...
[ 0. 0. 0. ... 0. 0. 0.]
[ 0. 0. 0. ... 0. 0. 0.]
[ 0. 0. 0. ... 0. 0. 0.]]
it appears all the nodata values from the raster were read into the array as zeroes. The raster in question consists mostly of nodata (null, as GRASS calls it) pixels, which should be ignored in the regression: if a value in either raster x or y is nodata, the corresponding x,y pair should not be used in the computation. Is it possible to define nodata values in the arrays so that they are ignored directly in this manner, or do I need to first filter the nodata pairs out of the pair of arrays?
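To make the filtering idea concrete, here is a small numpy-only sketch of what I have in mind, with hard-coded arrays standing in for the two rasters and np.nan marking nodata (whether garray.array can be told to read nulls as NaN, e.g. via a `null=` argument, is my assumption; if not, I would have to mask on the zero fill value instead):

```python
import numpy as np

# Stand-ins for xtile and ytile; np.nan marks nodata pixels.
x = np.array([[np.nan, 1.0,    2.0],
              [3.0,    np.nan, 4.0],
              [5.0,    6.0,    np.nan]])
y = np.array([[2.0,    2.5,    np.nan],
              [7.1,    8.0,    9.2],
              [11.3,   np.nan, 1.0]])

# Keep only those pixels where BOTH rasters have data.
valid = np.isfinite(x) & np.isfinite(y)
xf = x[valid]            # boolean indexing flattens the result to 1-D
yf = y[valid]

print(xf)                # [1. 3. 4. 5.]
print(yf)                # [ 2.5  7.1  9.2 11.3]
# These filtered 1-D arrays are what I would then pass to
# stats.theilslopes(yf, xf).
```

With mostly-nodata rasters this should also shrink the point count a lot, which might help with the memory problem above, but I am not sure it is the right approach.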
Thank you.