1

I wish to use the function gls in the R package nlme to analyse a set of nested spatial samples, in which many samples overlap in at least some spatial coordinates. I want to account for non-independence in the response variable (the thing I'm measuring in each spatial sample) using either a corStruct or pdMat object, but I'm confused about how to do this.

I have generated a covariance matrix that should encode all the information about non-independence between spatial samples. Each row/column is a distinct spatial sample, the diagonal contains the total number of sampling units captured by each spatial sample, and the off-diagonal elements contain counts of sampling units shared between spatial samples.

I think I should use the nlme function gls while specifying a correlation structure, possibly using a corSymm or pdMat object. But I've only seen examples where the correlation structure in gls is specified via a formula. How can I use the covariance matrix that I've created?

Roger
  • 677
  • 2
  • 8
  • 19
  • If you have questions about the best way to model your data, you should ask at [stats.se]. This does not seem like a specific programming question that's appropriate for Stack Overflow. – MrFlick Oct 12 '17 at 16:10
  • I am asking how to specifically get the covariance structure into the nlme function gls, though. Should I be using corSymm? pdMat? – Roger Oct 12 '17 at 18:20

1 Answers1

3

I discovered that you can pass the nlme function gls a positive-definite correlation matrix by using the general correlation structure provided by corSymm.

# convert your variance covariance matrix into a correlation matrix
CM <- cov2cor(vcv_matrix)

# if your correlation matrix contains zeros, as mine did, you need to convert it to a positive-definite matrix that substitutes very small numbers for those zeros
CM <- nearPD(CM)$mat

# convert into a corStruct object using general correlation structure provided by corSymm
C <- corSymm(CM[lower.tri(CM)], fixed = T)

# correlation structure can now be included in a gls model
gls(y ~ x, correlation = C, method = "ML")
Roger
  • 677
  • 2
  • 8
  • 19
  • FWIW it's not "contains zeros" that you need to worry about with `nearPD()`, it's non-positive-definiteness (e.g. an identity matrix would be fine even though it contains lots of zeros ...) – Ben Bolker Jul 09 '20 at 15:50