I want to implement dropout, i.e. randomly replace elements with NA and return the vector/list with those NA values.

The goal is to reduce overfitting, so there may be a better convenience function for this.

Approach 1

b<-rnorm(100); b[match(sample(b,10),b)] <- NA

where 10 values are replaced with NA.

Approach 2. How do I remove 90% of the population? This is not working; I get fewer than 90% NA values:

b<-rnorm(99); b[match(sample(b,length(b)*0.9),b)] <- NA

I think that does not work because of possible duplicate matches, i.e. selecting the same element more than once.

Is there any built-in or convenience function for dropout?

hhh

1 Answer

Use the replacement function is.na<- to assign NA values to a vector by position.

set.seed(1)
b <- rnorm(100)
b[match(sample(b, 10), b)] <- NA

set.seed(1)
b2 <- rnorm(100)
is.na(b2) <- sample(length(b2), 10)

identical(b, b2)
#[1] TRUE
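
The two results agree here because the values drawn by rnorm() are, in practice, all distinct. As a minimal sketch of where the value-based approach falls short, the toy vector below (made up for illustration, not from the question) contains repeated values. match() only returns the first position of each value, so fewer elements than requested can end up as NA, while sampling positions always replaces the requested number.

```r
# Toy vector with repeated values (hypothetical data for illustration).
x <- c(5, 5, 5, 7, 7, 9)

# Value-based approach: match() returns only the first position of each value,
# so only positions 1, 4 and 6 can ever be hit.
set.seed(42)
by_value <- x
by_value[match(sample(x, 4), x)] <- NA
n_by_value <- sum(is.na(by_value))   # at most 3, although 4 were requested

# Index-based approach: sample positions, not values.
set.seed(42)
by_index <- x
is.na(by_index) <- sample(length(x), 4)
n_by_index <- sum(is.na(by_index))   # exactly 4
```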

To replace 90% of the population with NAs, sample positions based on the length of the vector to be processed.

set.seed(1)
b <- rnorm(100)
is.na(b) <- sample(length(b), 0.9*length(b))
mean(is.na(b))
#[1] 0.9
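
The same idea can be wrapped in a small reusable function. This is only a sketch: dropout_na and its rate argument are hypothetical names, not a base R function. It also has to decide what to do when rate * length(x) is not a whole number (with the rnorm(99) vector from the question, 0.9 * 99 is 89.1); round() is used here as one possible choice.

```r
# Hypothetical helper: replace a fraction `rate` of the elements of x with NA.
dropout_na <- function(x, rate = 0.5) {
  stopifnot(is.numeric(rate), length(rate) == 1, rate >= 0, rate <= 1)
  # Number of elements to drop, rounded to the nearest whole number.
  n_drop <- round(rate * length(x))
  # Sample positions (not values), so each element is chosen at most once.
  is.na(x) <- sample.int(length(x), n_drop)
  x
}

set.seed(1)
b99 <- dropout_na(rnorm(99), rate = 0.9)
sum(is.na(b99))   # 89, i.e. round(0.9 * 99)
```

Because the number of dropped elements is fixed up front, the proportion of NAs can only be exactly 90% when 0.9 * length(x) is a whole number.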
Rui Barradas