
I would like to know what the difference is between using svyglm and using a weighted glm.

For example:

M1 = glm(formula = yy ~ age + gender + country , 
         family = binomial(link = "probit"), 
         data = P2013, 
         subset = (P2013$E27>=14 & P2013$E27<=17), 
         weights = P2013$PESOANO)

or define the sample design as:

diseño = svydesign(id = ~NUMERO, 
                   strata = ~ESTRATOGEO, 
                   data = P2013, 
                   weights = ~PESOANO)

diseño_per_1417 = subset(diseño, E27 >= 14 & E27 <= 17)

and then use svyglm:

M2 = svyglm(formula = yy ~ age + gender + country, 
            family = quasibinomial(link = "probit"),
            subset = (stratum != 0), 
            design = diseño_per_1417)

If I use M2 (svyglm), what can I use to compare models, the way stepwise selection does for a glm model?

Thanks, Natalia

Natuk

1 Answer


From help(glm):

Non-NULL weights can be used to indicate that different observations have different dispersions (with the values in weights being inversely proportional to the dispersions); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations. For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes: they would rarely be used for a Poisson GLM.

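I don't think those are the weights you are looking for: the weights argument of glm describes precision (or the number of trials), not sampling weights. From your example it seems you are dealing with a stratified survey, so you should definitely use svyglm.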
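To make the difference concrete, here is a minimal sketch that fits the same probit model both ways. It assumes the survey package is installed and uses its bundled api data (apistrat, with the variables stype, pw, fpc, sch.wide, ell and meals) as a stand-in for P2013, so the design and the variable names are illustrative only. The two fits should give essentially the same coefficients, but only svyglm computes standard errors that account for the strata and sampling weights.

```r
library(survey)

# Example data shipped with the survey package (a stand-in for P2013)
data(api)

# Stratified design: strata = school type, pw = sampling weight,
# fpc = number of schools in each stratum
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Binary outcome: did the school meet its school-wide growth target?
f <- I(sch.wide == "Yes") ~ ell + meals

# (a) glm with the sampling weights passed as prior weights
m_glm <- glm(f, family = quasibinomial(link = "probit"),
             data = apistrat, weights = pw)

# (b) svyglm, which uses the full design object
m_svy <- svyglm(f, design = dstrat,
                family = quasibinomial(link = "probit"))

# Point estimates: compare the two coefficient vectors
print(all.equal(coef(m_glm), coef(m_svy), tolerance = 1e-6))

# Standard errors: model-based (glm) next to design-based (svyglm)
se_glm <- sqrt(diag(vcov(m_glm)))
se_svy <- sqrt(diag(vcov(m_svy)))
print(cbind(glm = se_glm, svyglm = se_svy))
```

quasibinomial is used in both fits so that glm does not warn about non-integer weights.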

Florian Oswald
  • Thanks Florian! But do you know which estimator svyglm uses, or where I can find the derivation in a book? – Natuk May 02 '13 at 00:28
  • `svyglm` does some work related to your design (weights, strata, etc.) but eventually calls `glm`, so you should look in `?glm`. Basically, by specifying `family` as you do in your example, you set which link function you want to use ("logit", "probit", etc.). A concise explanation is given by Kleiber & Zeileis (2008, p. 122). Also check out the [website](http://staff.washington.edu/tlumley/survey/) of the survey package. Please accept my answer if you think that helps. – Florian Oswald May 02 '13 at 08:09
  • Yes, I have already checked several books and the survey package documentation, but I cannot find which estimator svyglm uses. I think svyglm does not estimate by maximum likelihood; could it be a Horvitz-Thompson estimator? I would like to find a text that explains it in detail. Thanks! – Natuk May 02 '13 at 17:00
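
On the estimator question: as far as I understand it, svyglm does not maximise a full likelihood. It solves the usual GLM score equations with each sampled unit weighted by its sampling weight (often called pseudo-maximum likelihood), so each weighted sum acts as a Horvitz-Thompson-type estimate of the corresponding population total. A sketch follows, in notation that is assumed here rather than taken from the package: s is the sample, w_i the sampling weight, pi_i the inclusion probability, g the link function and V the variance function.

```latex
\sum_{i \in s} w_i \,
  \frac{\partial \mu_i(\beta)}{\partial \beta} \,
  \frac{y_i - \mu_i(\beta)}{V\bigl(\mu_i(\beta)\bigr)} = 0,
\qquad
\mu_i(\beta) = g^{-1}\bigl(x_i^{\top} \beta\bigr),
\qquad
w_i = \frac{1}{\pi_i}
```

The standard errors then come from a design-based (linearization) variance estimate of this weighted sum rather than from the inverse information matrix, which is why they differ from those of a weighted glm. Lumley's book Complex Surveys: A Guide to Analysis Using R (2010) is one text that discusses regression with the survey package.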