Dropout in R: randomly remove elements and replace them with NA

Question

I want to implement dropout, i.e. randomly replace elements with NA and return the vector/list with those NA values.

The goal is to reduce overfitting, so there may be better convenience functions for this.

Approach 1

b<-rnorm(100); b[match(sample(b,10),b)] <- NA

Here 10 values are replaced with NA.

Approach 2: how do I remove 90% of the population? This is not working; I get less than 90%:

b<-rnorm(99); b[match(sample(b,length(b)*0.9),b)] <- NA

This does not work as intended, which I suspect is because of possible duplicate matches, i.e. selecting the same element twice.

Is there any builtin or convenience function for dropout?

Function `is.na<-` as in `is.na(b) <- sample(length(b), 10)`. — Rui Barradas, Oct 10 '18 at 15:57

Rui Barradas · Accepted Answer · 2018-10-10T16:27:37.857


Use the function `is.na<-` to assign NA values to a vector.

set.seed(1)
b <- rnorm(100)
b[match(sample(b, 10), b)] <- NA

set.seed(1)
b2 <- rnorm(100)
is.na(b2) <- sample(length(b2), 10)

identical(b, b2)
#[1] TRUE
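Since the question also mentions lists, here is a minimal sketch of the same assignment applied to a list. The list `l` and its contents are made up for illustration; the only assumption is that the default `is.na<-` method assigns NA by index, which should work for lists as well as atomic vectors.

```r
set.seed(1)
# Hypothetical example list with mixed element types
l <- list(1, "a", TRUE, 4L, "e")

# Replace 2 randomly chosen elements with NA, by position
is.na(l) <- sample(length(l), 2)

# is.na() on a list flags the elements that are a single NA
sum(is.na(l))
#[1] 2
```

Sampling positions rather than values avoids `match()` entirely, so it does not matter whether the elements are comparable or unique.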

To replace 90% of the population with NA values, sample indices based on the length of the vector being processed.

set.seed(1)
b <- rnorm(100)
is.na(b) <- sample(length(b), 0.9*length(b))
mean(is.na(b))
#[1] 0.9
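When the length is not a multiple of 10, as with the `rnorm(99)` in the question, `0.9 * length(b)` is not a whole number (89.1), and the count actually used is left to how `sample()` coerces it. A minimal sketch of a wrapper that makes the count explicit follows; `dropout_na` is a hypothetical helper name, not a base R function, and rounding to the nearest integer is an assumption about what is wanted.

```r
# Hypothetical helper (not part of base R): replace a fraction p of x with NA.
dropout_na <- function(x, p) {
  stopifnot(is.numeric(p), length(p) == 1, p >= 0, p <= 1)
  n_drop <- round(p * length(x))              # explicit whole-number count
  is.na(x) <- sample.int(length(x), n_drop)   # sample positions, not values
  x
}

set.seed(1)
b <- dropout_na(rnorm(99), 0.9)
sum(is.na(b))
#[1] 89
```

With 99 elements an exact 90% is impossible, so the result is 89 NA values (about 89.9%). Swapping `round` for `ceiling` would give 90 instead, if at least the requested fraction is needed.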

edited Oct 10 '18 at 16:27

answered Oct 10 '18 at 16:00

Rui Barradas


How would you implement the removal of 90% of population with NAs? – hhh Oct 10 '18 at 16:08
