0

I have two sets of gridded (NetCDF) data with dimensions: Time, S-N, W-E and would like to do a t-test of paired values in time across the entire grid with Python. The computation should return decisions across the entire grid that can then be processed into a spy plot aerial overlay. I am able to process this in MATLAB, after reading the variable, comprising two main steps (as a hint to what is expected in Python):

**[h,p] = ttest2(permute(grid_1,[3 2 1]),permute(griid_2,[3 2 1]))** % [3 2 1] ===> [Time S-N W-E]
**spy(squeeze(h),'k+',1)** % h is 1/0 decision.

Perhaps this would entail array transforms but my knowledge of Python is limited.

A snippet of my gridded data is as follows:

<xarray.DataArray 'WSPD' (Time: 120, south_north: 105, west_east: 120)>
array([[[3.0042849, 3.6635756, 3.5766048, ..., 2.7890186, 3.5537026,
         2.4510043],

Coordinates:

XLONG    (south_north, west_east) float32 -125.7 -125.7 ... -122.3 -122.2

XLAT     (south_north, west_east) float32 55.7 55.7 55.7 ... 56.99 56.99

XTIME    (Time) float32 3.003e+04 7.282e+04 ... 1.417e+06 3.742e+06

Time     (Time) datetime64[ns] 2010-08-01T08:00:00 ... 2020-07-01T08:00:00
Vinícius Félix
  • 8,448
  • 6
  • 16
  • 32
blackie
  • 43
  • 4
  • Not yet sure I understand. You have two datasets of equal dimensions and coordinates (`grid_1` and `grid_2`) and for each space coordinate, you'd like to perform a paired t-test for the two datasets across time? – Dahn Jul 15 '22 at 09:47
  • I posted an answer with the assumption that my understanding was correct. Let me please know in case I misunderstood. – Dahn Jul 15 '22 at 09:59

1 Answers1

0

You can perform paired t-test using scipy.stats.ttest_rel. It takes in two numpy arrays (to which xarray DataArrays natively converts) and compares them across an axis. Default is axis=0.

Example

Let's create some sample data

import xarray as xr
from scipy.stats import ttest_rel
import numpy as np

grid_1 = xr.tutorial.open_dataset('air_temperature').load().to_array().squeeze()
grid_2 = grid_1 + np.random.randn(*grid_1.shape)  # Normally perturbed grid_1
>>> grid_1
<xarray.DataArray (time: 2920, lat: 25, lon: 53)>
Coordinates:
  * lat       (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon       (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time      (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
    variable  <U3 'air'

Since we want to compare across time, we need to specify the correct axis. Here time is first, so we do not need to do anything. Otherwise we'd have to either grid_1.transpose('time', 'lat', 'lon') or call ttest_rel with the appropriate axis argument.

Now we can simply perform the paired t-test.

statres, pval = ttest_rel(grid_1, grid_2)

Plotting

Following the details added to the question, here is how to plot a spy plot with the appropriate ticks.

import matplotlib.pyplot as plt

spy1 = np.where(pval < 0.05, 1, 0)

# Plot spy plot
ax = plt.gca()
ax.spy(spy1, markersize=1, color='black')  # 'black' markers

# Prepare ticks
x_ticks = np.arange(0, grid_1.coords['lon'].shape[0], 10)
x_tick_labels = grid_1.coords['lon'].isel(lon=x_ticks).values

y_ticks = np.arange(0, grid_1.coords['lat'].shape[0], 10)
y_tick_labels = grid_1.coords['lat'].isel(lat=y_ticks).values

# Set ticks
ax.set_xticks(x_ticks)
ax.set_xticklabels(x_tick_labels)

ax.set_yticks(y_ticks)
ax.set_yticklabels(y_tick_labels)

enter image description here

Dahn
  • 1,397
  • 1
  • 10
  • 29
  • Thanks for the solution @Dahn. Please see my comment below for progress and what is expected for the graphics. – blackie Jul 17 '22 at 00:18
  • @blackie I have updated my answer to reflect the added details. By the way, a better way to add details to the question is to edit the question — that way, by reading the question, people get the whole context. Please also upvote the answer if you've found it useful. – Dahn Jul 17 '22 at 08:46
  • Thanks @Dahn. My target is not for a standalone, rather, to plot the dots as an overlay on a contourf. I get 'AttributeError: 'Line2D' object has no property 'cmap'. How could this be achieved? Perhaps, the 1's have to be mapped to their specific latitude/longitude positions, then plotted on an existing plot of a variable from the netcdf. Please look at this code: https://geocat-examples.readthedocs.io/en/latest/gallery/Scatter/NCL_scatter_3.html#sphx-glr-gallery-scatter-ncl-scatter-3-py . It is similar to what I want, but with the assumption of an underlying contour fill. – blackie Jul 19 '22 at 02:48
  • https://wrf-python.readthedocs.io/en/latest/user_api/generated/wrf.xy_to_ll.html also hints at the method for retrieving lat-lon coordinates of the 1's in this case? – blackie Jul 19 '22 at 04:44
  • @blackie I am not sure how to reproduce your problem. It would be good to isolate it and post it as another question, as it's already outside the scope of this question. Feel free to post the link to the question as a comment here as well. – Dahn Jul 19 '22 at 20:06
  • So I figured it out. Essentially, I want to plot the spy on geo_axes which requires retrieving the lat-lon of the 1's. The first step is to filter for the 1's arrray, followed by separating the x and y grid indices, and the last is to derive the lat-lons from the y-x indices. These are the steps I used: grid_ones = np.argwhere(pval < 0.05) x_ones = grid_ones[:,1] y_ones = grid_ones[:,0] [x_coord,y_coord] = wrf.xy_to_ll(ncfile, x_ones, y_ones, meta=False) – blackie Jul 20 '22 at 04:38