Does anyone know how to use BRR weights in Lumley's survey package for estimating variance if your dataset already has BRR weights in it?

I am working with PISA data, and they already include 80 BRR replicates in their dataset. How can I get as.svrepdesign to use these, instead of trying to create its own? I tried the following and got the subsequent error:

dstrat <- svydesign(id=~uniqueID,strata=~strataVar, weights=~studentWeight, 
                data=data, nest=TRUE)
dstrat <- as.svrepdesign(dstrat, type="BRR")

Error in brrweights(design$strata[, 1], design$cluster[, 1], ..., 
    fay.rho = fay.rho,  : Can't split with odd numbers of PSUs in a stratum

Any help would be greatly appreciated, thanks.

Gavin Simpson
RickyB
  • to work with pisa in R, [this](http://www.asdfree.com/search/label/program%20for%20international%20student%20assessment%20%28pisa%29) will do the correct setup for you :) you'll need to incorporate the multiple imputation in your analysis, which those scripts automate. – Anthony Damico Dec 09 '13 at 12:56
  • You could also try the `RALSA` package which has a graphical user interface: https://cran.r-project.org/package=RALSA For guides on how to use it, see here: http://ralsa.ineri.org/user-guide/ – panman Nov 04 '21 at 14:19

2 Answers

no need to use as.svrepdesign() if you have a data frame with the replicate weights already :) you can create the replicate weighted design directly from your data frame.

say you have data with a main weight column called mainwgt and 80 replicate weight columns called repwgt1 through repwgt80. you could use this --

yoursurvey <-
    svrepdesign( 
    weights = ~mainwgt , 
    repweights = "repwgt[0-9]+" , 
    type = "BRR", 
    data = yourdata ,
    combined.weights = TRUE
)

-- this way, you don't have to identify the exact column numbers. then you can run normal survey commands like --

svymean( ~variable , design = yoursurvey )

if you'd like another example, here's some example code and an explanatory blog post using the current population survey.
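
Before building the design, it can help to check which columns the `repweights` pattern will actually pick up, since the string is treated as a regular expression matched against the column names. Below is a minimal base-R sketch of that check. The data frame and its contents are made up for illustration (`yourdata`, `mainwgt` and `repwgt1` through `repwgt80` follow the naming used above, everything else is an assumption), and it does not call the survey package at all.

```r
# Hypothetical data: one analysis variable, one main weight, 80 replicate weights.
set.seed(1)
n <- 5
yourdata <- data.frame(variable = rnorm(n), mainwgt = runif(n, 1, 2))
for (i in 1:80) {
  # Made-up replicate weights: the main weight scaled by 0.5 or 1.5.
  yourdata[[paste0("repwgt", i)]] <-
    yourdata$mainwgt * sample(c(0.5, 1.5), n, replace = TRUE)
}

# Column names matched by the same pattern passed to repweights above.
repcols <- grep("repwgt[0-9]+", names(yourdata), value = TRUE)
length(repcols)    # number of replicate weight columns found
head(repcols, 3)   # first few matches, in column order

# An anchored version of the pattern is stricter: it will not pick up
# columns such as "old_repwgt1_backup" if your data happens to have any.
strictcols <- grep("^repwgt[0-9]+$", names(yourdata), value = TRUE)
```

If `length(repcols)` is not the number of replicates you expect (80 for PISA), fix the pattern before calling svrepdesign().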

ako
Anthony Damico
  • is it not necessary to specify `type` and `rho` arguments? I suppose one needs to know the specific design of the replicate weights to know? – ako Nov 06 '12 at 08:13
  • i think type defaults to "BRR" but not 100%. rho is only needed for type = "Fay" [example](https://github.com/ajdamico/usgsd/blob/9be971706a69f4e569a7d5643eab602a5c27b311/Current%20Population%20Survey/2012%20asec%20-%20analysis%20examples.R#L80). :) – Anthony Damico Nov 06 '12 at 12:10
  • this answer is not sufficient if you use any of the "plausible values" variables (which are probably central to any analysis). to use those correctly, use [this setup instead](http://www.asdfree.com/search/label/program%20for%20international%20student%20assessment%20%28pisa%29). – Anthony Damico Jan 04 '14 at 11:28

I haven't used the PISA data, but last year I used the svrepdesign method with the Public Use Microdata Sample (PUMS) from the American Community Survey (US Census Bureau), which also ships with 80 replicate weights. They say to use the Fay method for that specific survey, so here is how one can construct the svyrep.design object from that data:

pums_p.rep<-svrepdesign(variables=pums_p[,2:7],
    repweights=pums_p[8:87],
    weights=pums_p[,1],combined.weights=TRUE,
    type="Fay",rho=(1-1/sqrt(4)),scale=1,rscales=1)

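#CROSS - TABS
#unweighted; read the columns from the data frame, since attach() on the
#design object does not expose the survey variables by name
xtabs(~ is5to17youth + withinAMILimit, data=pums_p)
table(pums_p$is5to17youth, pums_p$withinAMILimit)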

#weighted, mean income by sex by race for select age groups
svyby(~PINCP,~RAC1P+SEX,subset(
   pums_p.rep,AGEP > 25 & AGEP <35),na.rm = TRUE,svymean,vartype=c("se","cv"))
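
To see what the rho in the svrepdesign() call amounts to, here is a small base-R sketch of the arithmetic behind Fay's method with 80 replicates. The replicate estimates are simulated and purely hypothetical. The formula is the textbook one centered on the full-sample estimate; my understanding is that the survey package centers on the mean of the replicates unless you ask for mse=TRUE, so treat this as an illustration rather than a reproduction of its output.

```r
R   <- 80                  # number of replicate weights
rho <- 1 - 1 / sqrt(4)     # same expression as in the svrepdesign() call

# Fay variance multiplier: 1 / (R * (1 - rho)^2)
fay_scale <- 1 / (R * (1 - rho)^2)

# Hypothetical full-sample estimate and 80 replicate estimates.
set.seed(42)
theta_hat  <- 10
theta_reps <- theta_hat + rnorm(R, sd = 0.3)

# Variance: scaled sum of squared deviations of the replicate
# estimates from the full-sample estimate.
fay_var <- fay_scale * sum((theta_reps - theta_hat)^2)
fay_se  <- sqrt(fay_var)

rho        # Fay coefficient
fay_scale  # multiplier applied to the sum of squares
fay_se     # standard error for the made-up estimates
```

With rho = 0.5 the multiplier works out to 4/80, which is the factor the ACS documentation gives for its replicate weights.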

In getting this to work, I found the article from A. Damico helpful: Damico, A. (2009). Transitioning to R: Replicating SAS, Stata, and SUDAAN Analysis Techniques in Health Policy Data. The R Journal, 1(2), 37–44.

ako