
I want to create a new column that labels a certain number of columns 1 and the remaining columns 0. If I want 20% of my columns to get a 1, I can get close with:

test$Rand_Num <- sample(1:5, nrow(test), replace=TRUE)
test$Output <- ifelse(test$Rand_Num==1,1,0)

However, I would like to be able to say that if I had 1000 columns, then exactly 200 are randomly labeled 1 and the rest are labeled 0, in a way that I can quickly change to 30%, etc., for different scenarios.

Thanks!

Drthm1456

1 Answer


If you want to randomly select columns such that 20% (or some other percentage) of columns are randomly selected, you could return a vector of selected columns (for input data.frame df) with

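p <- 0.2  # change me! fraction of columns to select
nselect <- round(p * ncol(df))  # number of columns to select
whichcolumns <- sample(seq_len(ncol(df)), nselect)  # random column indices, without replacement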

To turn this into a vector of zeroes and ones, you could do something like

whichcolumns_01 <- rep(0, ncol(df))
whichcolumns_01[whichcolumns] <- 1

Admittedly, this has some duct tape in it, but it should work.
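Since the code in the question uses nrow(test), the goal may actually be to label rows rather than columns. A minimal sketch of the same idea applied to rows follows; the helper name random_flags and the example data frame are hypothetical and not from the original posts, and the helper simply wraps the index-sampling approach above.

```r
# Hypothetical helper (not part of the original answer):
# returns a 0/1 vector of length n with exactly round(p * n) ones
# placed at random positions.
random_flags <- function(n, p) {
  n_ones <- round(p * n)               # how many entries get a 1
  flags <- integer(n)                  # start with all zeroes
  flags[sample.int(n, n_ones)] <- 1L   # set randomly chosen positions to 1
  flags
}

set.seed(42)                           # only to make the example reproducible
test <- data.frame(id = 1:1000)        # assumed stand-in for the asker's data
test$Output <- random_flags(nrow(test), 0.2)
sum(test$Output)                       # 200
```

Changing the second argument to 0.3 would give 300 ones for the same 1000 rows. For columns instead of rows, pass ncol(df) as the first argument.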

Matt Tyers