I'm trying to create a Moran scatterplot using PySAL (the one with HH/HL/LH/LL quadrants). I think I've got there, but I'd like to check my understanding, interpretation and code. The code below uses the built-in North Carolina SIDS data set and row-standardised weights.
import os
import numpy as np
import pysal as ps
import matplotlib.pyplot as plt
import matplotlib.cm as cms
# shpdir is wherever the PySAL example data are installed
col = 'SIDR74'
w = ps.open(os.path.join(shpdir,"sids2.gal")).read()
f = ps.open(os.path.join(shpdir,"sids2.dbf"))
y = np.array(f.by_col(col))
w.transform = 'r'
### Are these next three steps right? ###
# Calculate the spatial lag
yl = ps.lag_spatial(w, y)
# Z-Score standardisation
yt = (y - y.mean())/y.std()
ylt = (yl - yl.mean())/yl.std()
# Elements of a Moran's I Scatterplot
# X-axis = z-standardised attribute values
# Y-axis = z-standardised lagged attribute values
# Quadrants = HH=1, LH=2, LL=3, HL=4
#
# So from that it follows that:
# HH == ylt > 0 and yt > 0 = 1
# LH == ylt > 0 and yt < 0 = 2
# LL == ylt < 0 and yt < 0 = 3
# HL == ylt < 0 and yt > 0 = 4
# Initialise an array with a default
# value to hold the quadrant information
quad = np.zeros(yt.shape)
quad[np.bitwise_and(ylt > 0, yt > 0)]=1 # HH
quad[np.bitwise_and(ylt > 0, yt < 0)]=2 # LH
quad[np.bitwise_and(ylt < 0, yt < 0)]=3 # LL
quad[np.bitwise_and(ylt < 0, yt > 0)]=4 # HL
plt.scatter(yt, ylt, c=quad, cmap=cms.summer)
plt.suptitle("Moran Scatterplot?")
plt.show()
That produces something that seems reasonable, but I think I've tied myself in knots: I haven't actually calculated Moran's I yet (via ps.Moran_Local(...)), and yet this is called a Moran scatterplot...
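To make the link to Moran's I concrete, here is a minimal NumPy-only sketch I put together as a sanity check. It rests on assumptions that are not part of my code above: the five values in `y` and the chain-shaped neighbour structure in `W` are made-up toy data, and the lag is taken of the z-scores themselves rather than being standardised separately. As far as I understand it, with row-standardised weights the slope of the line fitted through such a plot should equal global Moran's I, which is what the sketch checks.

```python
import numpy as np

# Hypothetical toy data (not the SIDS file): 5 areas in a row,
# each one a neighbour of the areas immediately next to it.
y = np.array([2.0, 3.0, 5.0, 9.0, 11.0])
n = len(y)

W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            W[i, j] = 1.0
W = W / W.sum(axis=1, keepdims=True)  # row-standardise: each row sums to 1

# Z-scores of the attribute, then the spatial lag of those z-scores.
# Note: the lag is NOT standardised a second time here.
z = (y - y.mean()) / y.std()
lag_z = W.dot(z)

# Slope of the ordinary least-squares line through (z, lag_z).
slope, intercept = np.polyfit(z, lag_z, 1)

# Global Moran's I from its textbook formula: (n / S0) * z'Wz / z'z,
# where S0 is the sum of all weights (equal to n after row-standardisation).
morans_i = (n / W.sum()) * z.dot(lag_z) / z.dot(z)

# Quadrant coding as in my code above: HH=1, LH=2, LL=3, HL=4 (0 = on an axis).
quad = np.zeros(n, dtype=int)
quad[(lag_z > 0) & (z > 0)] = 1  # HH
quad[(lag_z > 0) & (z < 0)] = 2  # LH
quad[(lag_z < 0) & (z < 0)] = 3  # LL
quad[(lag_z < 0) & (z > 0)] = 4  # HL

print("slope of fitted line:", slope)
print("global Moran's I:    ", morans_i)
print("quadrants:           ", quad)
```

If that reasoning holds, the scatterplot itself only needs the z-scores and their spatial lag, and the two printed numbers should agree to floating-point precision. Standardising the lag a second time, as I do with `ylt` above, would rescale the vertical axis, so the slope would presumably no longer equal Moran's I.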