I found the function below and would like to adapt it so that outliers are replaced with NA instead of the whole observation (row) being removed.
I tried adding <- NA to this line:
data <- data[!outliers(data[[col]]),]
but I cannot make it work. Could you help me adapt it, please?
Here is the code with some simulated data. Please let me know if you need anything else.
Thank you very much in advance.
# Simulated data: two uncorrelated normal variables (X1, X2) with mean 4
cov.matone <- matrix(c(1, 0,
                       0, 1), nrow = 2)
data <- data.frame(MASS::mvrnorm(n = 1e4,
                                 mu = c(4, 4),
                                 Sigma = cov.matone))
# Returns a logical vector: TRUE where x lies outside the 1.5 * IQR fences
outliers <- function(x) {
  Q1 <- quantile(x, probs = .25, na.rm = TRUE)
  Q3 <- quantile(x, probs = .75, na.rm = TRUE)
  iqr <- Q3 - Q1
  upper_limit <- Q3 + (iqr * 1.5)
  lower_limit <- Q1 - (iqr * 1.5)
  x > upper_limit | x < lower_limit
}
# Drops every row that has an outlier in any of the given columns
remove_outliers <- function(data, cols = names(data)) {
  for (col in cols) {
    data <- data[!outliers(data[[col]]), ]
  }
  data
}
data_nooutliers <- remove_outliers(data, c('X1', 'X2'))
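
To make the goal concrete, here is a minimal sketch of the behaviour I am after, not a definitive solution. The name replace_outliers and the small toy data frame are hypothetical and only for illustration. The idea is to assign NA to the flagged cells of each column rather than subsetting rows, so the number of rows stays the same.

```r
# Same 1.5 * IQR rule as outliers() above, repeated so this runs on its own.
outliers <- function(x) {
  q <- quantile(x, probs = c(.25, .75), na.rm = TRUE)
  iqr <- q[[2]] - q[[1]]
  x > q[[2]] + 1.5 * iqr | x < q[[1]] - 1.5 * iqr
}

# Hypothetical helper: set outlying cells to NA, column by column,
# instead of dropping the rows that contain them.
replace_outliers <- function(data, cols = names(data)) {
  for (col in cols) {
    is_out <- outliers(data[[col]])
    # which() skips NA flags, so values that are already missing are left alone
    data[[col]][which(is_out)] <- NA
  }
  data
}

# Toy data (illustration only): 50 normal draws plus one extreme value per column
set.seed(1)
toy <- data.frame(X1 = c(rnorm(50), 100),
                  X2 = c(rnorm(50), -100))
toy_na <- replace_outliers(toy, c("X1", "X2"))
```

One difference from remove_outliers: there, rows dropped for X1 are already gone when the quartiles of X2 are computed, whereas in this sketch each column is judged on all of its own values.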