Let's say I have a dataset that looks like this:
set.seed(2016)
d <- data.frame(x = rnorm(1000),
                y = sample(x = c("A", "B", "C"), size = 1000, replace = TRUE))
I have some method that selects a subset:
s <- data.frame(x = rnorm(100),
                y = sample(x = c("A", "A", "B", "C"), size = 100, replace = TRUE))
The subset has a different distribution of y:
prop.table(table(d$y))
    A     B     C
0.335 0.349 0.316
prop.table(table(s$y))
   A    B    C
0.44 0.34 0.22
Given the classes, y, for the full data set, d, and the subset, s, how can I draw a sample from d with the same class distribution and size as s?
Preferably, I would like the result as a vector of indices of d.
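To make the desired result concrete, here is a minimal sketch of one possible approach: stratified sampling without replacement, drawing from each class of d as many row indices as that class has in s. The names `target` and `idx` are hypothetical. The sketch assumes every class in s also occurs in d with at least as many rows as s needs; it is not necessarily the best way to do this.

```r
set.seed(2016)
d <- data.frame(x = rnorm(1000),
                y = sample(x = c("A", "B", "C"), size = 1000, replace = TRUE))
s <- data.frame(x = rnorm(100),
                y = sample(x = c("A", "A", "B", "C"), size = 100, replace = TRUE))

# Number of rows per class in the subset (hypothetical name: target)
target <- table(s$y)

# For each class, draw that many row indices of d without replacement.
# sample.int() on the pool length avoids the sample() surprise when a
# class has only a single candidate row.
idx <- unlist(lapply(names(target), function(cls) {
  pool <- which(d$y == cls)
  pool[sample.int(length(pool), target[[cls]])]
}))

# The sample has the same size and class counts as s
stopifnot(length(idx) == nrow(s))
stopifnot(all(table(d$y[idx]) == target))
```

Here `idx` is a vector of row indices of d, so `d[idx, ]` gives the sampled rows. Because the per-class counts are copied from s, the class proportions match s exactly rather than only in expectation.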