
I wanted to build a small example to explain my raking model, but the code below does not work.

I don't understand why, since as far as I know I did exactly the same thing a few days ago.

Maybe I overlooked something?

The code I made:

```r
require(pacman)
p_load(tidyverse, anesrake, weights, purrr)

leeftijd <- c(0.35, 0.40, 0.25)
school <- c(0.20, 0.30, 0.35, 0.15)

dat <- data.frame(leeftijd = c('jong', 'middel', 'oud', 
                               'jong', 'middel', 'oud', 
                               'jong', 'middel', 'oud',
                               'jong', 'middel', 'oud'),
                  school   = c('mbo1-2','mbo1-2','mbo1-2', 
                               'mbo3-4','mbo3-4','mbo3-4',
                               'hbo','hbo','hbo', 
                               'wo','wo','wo'),
                  perc     = c(0.06,0.06,0.03,
                               0.08,0.10,0.10,
                               0.9,0.10,0.9,
                               0.03,0.14,0.08))
dat$leeftijd <- as.factor(dat$leeftijd)
dat$school <- as.factor(dat$school)

target <- list(leeftijd, school)
names(target) <- c("leeftijd", "school")
levels(target$leeftijd) <- c('jong', 'middel', 'oud')
levels(target$school) <- c('mbo1-2', 'mbo3-4', 'hbo', 'wo')

raking <- anesrake(target,
                   dat,
                   dat$caseid,
                   cap = 10,
                   choosemethod = "total",
                   type = "pctlim",
                   pctlim = 0.05
)
```

I checked all the classes, structures, you name it. Maybe I forgot something?

If I let R compute the targets from the data like this, it works:

```r
target <- with(dat, list(
  leeftijd = wpct(leeftijd, perc),
  school = wpct(school, perc)
))
```

But these are not the target proportions I want. Why doesn't it work with my own `leeftijd` and `school` vectors?
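A possible explanation, sketched in base R only: calling `levels<-` on a plain numeric vector just attaches a `levels` attribute and leaves the elements unnamed, whereas the output of `wpct()` appears to carry the factor levels as element names. The variable names `with_levels` and `with_names` are made up for this illustration.

```r
# Hand-typed target proportions, as in the question
leeftijd <- c(0.35, 0.40, 0.25)

# `levels<-` only sets a "levels" attribute; the elements themselves stay unnamed
with_levels <- leeftijd
levels(with_levels) <- c("jong", "middel", "oud")
names(with_levels)   # NULL

# `names<-` labels each element, so a value can be looked up by level
with_names <- leeftijd
names(with_names) <- c("jong", "middel", "oud")
names(with_names)
with_names[["middel"]]
```

If the raking function looks targets up by element name, only the second form gives it anything to match against the factor levels in `dat`.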

Thanks in advance!

  • I think `levels(target$...) <- ...` should be `names(target$...) <- ...`. According to the docs (and the examples): `Targets for factors must be labeled to match every level present in the dataframe (e.g. a variable with 2 age groups "under40" and "over40" should have elements named "under40" and "over40" respectively).` – stefan Jul 12 '23 at 20:34
  • That's it! Thank you so much! If you want to make your comment an answer, I can accept it :) – Myrthe Kroes Jul 13 '23 at 07:25
  • It does run now, but I get the error 'No variables are off by more than 1 percent', even though that's really not the case. Do you know why this happens? I tried everything from this post: https://stackoverflow.com/questions/65891210/anesrake-error-no-variables-are-off-by-more-than-when-they-are – Myrthe Kroes Jul 13 '23 at 07:46
  • The second issue is that your data has no column named `caseid`; after doing `dat$caseid <- seq_len(nrow(dat))` the raking works fine. – stefan Jul 13 '23 at 19:51
  • It does! Thank you so much. I don't know why it didn't click in a smaller example. Thank you for your help :) – Myrthe Kroes Jul 14 '23 at 07:06
  • If I add the weights to `dat` with `dat$gewicht <- raking$weightvec`, the weights don't make sense. If I multiply the values in `dat` by the weights, the row and column sums don't match the marginals. Why is that? – Myrthe Kroes Jul 14 '23 at 09:20
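
On the last comment, one thing that may explain the mismatch: in the call shown in the question, `perc` is never passed to the raking function, so each of the 12 rows counts as one respondent. The weights are then meant to be summed per level, not multiplied by `perc`. The sketch below is a plain base R raking loop (iterative proportional fitting), used here as a stand-in and not `anesrake` itself, so its numbers may differ from `raking$weightvec`. The helper `rake()` is hypothetical; the targets are named as suggested in the first comment.

```r
# One row per leeftijd x school cell, in the same order as the question's data
dat <- data.frame(
  leeftijd = rep(c("jong", "middel", "oud"), times = 4),
  school   = rep(c("mbo1-2", "mbo3-4", "hbo", "wo"), each = 3)
)
dat$caseid <- seq_len(nrow(dat))

# Targets as named vectors, one name per level
target <- list(
  leeftijd = c(jong = 0.35, middel = 0.40, oud = 0.25),
  school   = c("mbo1-2" = 0.20, "mbo3-4" = 0.30, hbo = 0.35, wo = 0.15)
)

# Hypothetical helper: rescale weights variable by variable until they stop changing
rake <- function(data, target, maxit = 50, tol = 1e-10) {
  w <- rep(1, nrow(data))                      # every row starts as one case
  for (i in seq_len(maxit)) {
    w_old <- w
    for (v in names(target)) {
      groups <- as.character(data[[v]])
      # current weighted share of each level
      shares <- vapply(names(target[[v]]),
                       function(l) sum(w[groups == l]), numeric(1)) / sum(w)
      w <- as.numeric(w * target[[v]][groups] / shares[groups])
    }
    if (max(abs(w - w_old)) < tol) break
  }
  w / mean(w)                                  # normalise to mean 1
}

dat$gewicht <- rake(dat, target)

# Check the marginals by summing the weights per level
leeftijd_share <- tapply(dat$gewicht, dat$leeftijd, sum) / sum(dat$gewicht)
school_share   <- tapply(dat$gewicht, dat$school, sum) / sum(dat$gewicht)
leeftijd_share[names(target$leeftijd)]
school_share[names(target$school)]
```

Checked this way, the weighted shares should line up with the targets. If `perc` is supposed to act as a starting weight for each cell, it would have to enter the raking as a base weight, which this sketch does not do.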
