I recently came across the R package validate, which is very useful when you want to validate a full data set against pre-defined rules. For example:

library(validate)

# Each named argument is one rule, evaluated against the columns of the data
v <- validator(
  Species.na = !is.na(Species),
  Species.range = Species %in% c("setosa", "versicolor", "virginica"),
  Sepal.Width.na = !is.na(Sepal.Width),
  Sepal.Width.range = Sepal.Width >= 2 & Sepal.Width <= 4,
  Sepal.Length.relation = Sepal.Length / Petal.Length < 4)

valied <- confront(iris, v)

Now I am wondering whether something similar is available for missing-value imputation. Packages like mice and mi are really nice, but their imputation methods are standard, not user-defined or custom. Can anyone suggest a way to set up pre-defined imputation rules and apply them to an R data.frame? Something that could work like this (hypothetical syntax):

m <- missing(
  Species.na = if(is.na(Species)) Species <- "setosa",
  Sepal.Width.na = if(is.na(Sepal.Width)) Sepal.Width <- 3.5)

mi <- confront(iris, m)
  • How is this different than `ifelse(is.na(iris$Species), "setosa", iris$Species)` etc? – alexwhitworth May 20 '16 at 21:20
  • The `mice` package actually allows setting up user-defined functions for imputations. There is an article about `mice` in the Journal of Statistical Software that illustrates the general procedure. You can also have a look at the functions already implemented in the `mice` package. They have a common structure. – SimonG Jun 12 '16 at 13:00
  • I was able to use the validator() function along with the confront() function to accomplish my objective. This package is really cool. – Rakesh Poduval Oct 19 '16 at 09:36
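
For illustration, the kind of rule-based imputation asked about can be sketched in base R. This is only a minimal sketch under stated assumptions: `impute_with_rules` and the `rules` list are hypothetical names, not part of validate or mice. Each rule is assumed to be either a constant or a function of the data frame, and the median rule is just an example of a custom rule.

```r
# Hypothetical helper (not from validate or mice): fill the NAs in each
# column named in `rules` with that column's rule.
impute_with_rules <- function(data, rules) {
  for (column in names(rules)) {
    rule <- rules[[column]]
    missing_rows <- is.na(data[[column]])
    # A rule is either a constant or a function of the whole data frame.
    fill <- if (is.function(rule)) rule(data) else rule
    # A full-length result is subset to the missing rows; a single value is recycled.
    if (length(fill) == nrow(data)) fill <- fill[missing_rows]
    data[[column]][missing_rows] <- fill
  }
  data
}

# Illustrative rules: two constants from the question, one custom function.
rules <- list(
  Species      = "setosa",
  Sepal.Width  = 3.5,
  Petal.Length = function(d) median(d$Petal.Length, na.rm = TRUE)
)

# Copy of iris with a few values blanked out for the demonstration.
iris_na <- iris
iris_na$Species[c(1, 51)]   <- NA
iris_na$Sepal.Width[c(2, 3)] <- NA
iris_na$Petal.Length[10]     <- NA

imputed <- impute_with_rules(iris_na, rules)
```

Because the assignment only touches the rows that are `NA`, `Species` stays a factor, and columns without a rule are left unchanged. The resulting data frame could then be passed to `confront()` together with the validator defined in the question to check the imputed values against the same rules.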
