
I am trying to accomplish the following:

  1. I have a dataset with many variables, one of which is Gender, with two levels, "M" and "F".

  2. I want to sample 1000 observations from this dataset without replacement so that I get an equal number of "M" and "F", that is, 500 each.

Below is the code I am trying. x is the dataset, so x$Gender is the variable column.

test_sample <- x[sample(nrow(x), 1000, replace = FALSE, prob = ?), ]

Any idea how I can make this work?

Arulkumar

1 Answer


This should work for a data.frame df with a variable Gender, provided it contains at least 500 rows of each gender:

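# Row indices for each gender
males <- which(df$Gender == "M")
females <- which(df$Gender == "F")

# Draw 500 indices from each group; sample() defaults to replace = FALSE
malesSampled <- sample(males, size = 500)
femalesSampled <- sample(females, size = 500)

# Combine the sampled rows into one 1000-row data frame
dfSampled <- df[c(malesSampled, femalesSampled), ]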
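
The same idea can be wrapped in a small helper that works for any number of groups. This is only a sketch in base R: the function name sample_balanced, the toy data frame, and its Score column are hypothetical and not part of the original question. It indexes with sample.int() rather than calling sample() on the index vector directly, because sample(v, ...) treats a length-one numeric v as 1:v, and it stops early if a group has too few rows to sample without replacement.

```r
# Hypothetical helper: draw n_per_group rows, without replacement,
# from each level of the column named by group_col.
sample_balanced <- function(df, group_col, n_per_group) {
  # Row indices split by group, e.g. list(F = c(...), M = c(...))
  idx_by_group <- split(seq_len(nrow(df)), df[[group_col]])

  picked <- lapply(idx_by_group, function(idx) {
    if (length(idx) < n_per_group) {
      stop("a group has fewer rows than n_per_group")
    }
    # sample.int() picks positions within idx, avoiding the
    # sample(v) == sample(1:v) surprise when idx has length one
    idx[sample.int(length(idx), n_per_group)]
  })

  df[unlist(picked, use.names = FALSE), , drop = FALSE]
}

# Toy data (an assumption for illustration only)
set.seed(42)
toy <- data.frame(
  Gender = sample(c("M", "F"), 5000, replace = TRUE),
  Score = rnorm(5000)
)

balanced <- sample_balanced(toy, "Gender", 500)
table(balanced$Gender)
```

Calling set.seed() first makes the draw reproducible; drop it if you want a different sample on each run.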
lmo