Does anyone know of a way to estimate multivariate Box-Cox transformations with survey data in R? I'm not aware of anything that takes strata and clusters into account (which is what my data have), but even something that handles probability weights would be great. I'm mostly worried that the distribution of one or more variables may change once probability weights are applied, so the estimated transformation may change radically. There may also be implications for the errors, the Box-Cox algorithm and so on, but that is beyond what is basically a theory-confirmation approach.
Updated question
The R function powerTransform works great, but I don't think there's anything yet for survey data. I thought Stata could handle this, but as Nick pointed out, this is not the case. The only Box-Cox transformation which handles sampling weights seems to be this.
Are you aware of any R function that lets you apply both univariate and multivariate Box-Cox transformations to probability-weighted data?
I don't have any data but I was just wondering if anyone had found a solution to this. I know people appreciate when a specific example is given so...
Univariate Box-Cox: Results are returned for univariate Box-Cox when using lm and svyglm (survey package) objects.
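library(survey)
data(api)
library(car)
# stratified design with sampling weights and a finite population correction
dstrat <- svydesign(id=~1, strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
Sur <- svyglm(api00~mobility, design=dstrat)   # fit that uses the survey design
NotSur <- lm(api00~mobility, data=apistrat)    # fit that ignores the design
powerTransform(Sur)
powerTransform(NotSur)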
However, I don't think powerTransform handles the survey object correctly, because you get the same results as NotSur (and different from Sur) when you run
None <- svydesign(id=~1, weights=rep(1,nrow(apistrat)), data=apistrat)
Sur2<-svyglm(api00~mobility, design=None)
powerTransform(Sur2)
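To make concrete what I mean by taking the weights into account, here is a rough base-R sketch of a weighted (pseudo-likelihood) version of the univariate Box-Cox estimate. This is only my assumption about how it could be done, not something powerTransform or the survey package is known to do: each observation's log-likelihood contribution is simply multiplied by its weight. The function names (bc_transform, bc_pseudo_loglik, bc_weighted_lambda) and the simulated data are made up for illustration, and strata, clusters and the fpc are ignored, so it says nothing about standard errors.

```r
# Box-Cox transform of a strictly positive response
bc_transform <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Weighted profile log-likelihood in lambda (up to an additive constant).
# Assumption: each unit's contribution is multiplied by its weight w.
bc_pseudo_loglik <- function(lambda, y, X, w) {
  z <- bc_transform(y, lambda)
  fit <- lm.wfit(X, z, w)                      # weighted least squares on transformed y
  sigma2 <- sum(w * fit$residuals^2) / sum(w)  # weighted residual variance
  -0.5 * sum(w) * log(sigma2) + (lambda - 1) * sum(w * log(y))
}

# Maximise the weighted profile log-likelihood over a search interval
bc_weighted_lambda <- function(y, X, w, interval = c(-3, 3)) {
  optimize(bc_pseudo_loglik, interval, y = y, X = X, w = w, maximum = TRUE)$maximum
}

# Simulated toy data (hypothetical): log(y) is linear in x, so lambda should be near 0
set.seed(1)
n <- 500
x <- runif(n)
y <- exp(1 + 2 * x + rnorm(n, sd = 0.3))
X <- cbind(1, x)                               # design matrix with intercept
w <- runif(n, 1, 10)                           # made-up probability weights

lambda_unweighted <- bc_weighted_lambda(y, X, rep(1, n))
lambda_weighted   <- bc_weighted_lambda(y, X, w)
```

In this sketch only the relative sizes of the weights matter, so multiplying all of them by a constant leaves the estimated lambda unchanged. Whether this is a sensible estimator under a stratified or clustered design is exactly what I'm unsure about.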
I'm even less sure how you would transform towards multivariate normality, since you'd have to pass the actual data rather than a survey object, e.g.
summary(powerTransform(cbind(api00,mobility)~1,apistrat))