My code is:

    df <- read.csv("data")
    summary(df)

    library(Hmisc)
    imp_a <- impute(df$a, mean)
    df$a <- imp_a
    imp_b <- impute(df$b, mean)
    df$b <- imp_b

If there are more attributes than just a and b, how do I loop over 1000 attributes? Thank you very much.

1 Answer

Assuming your data is as simple as this (but with more columns)...

    df <- data.frame(a = c(0, 1, NA), b = c(1, NA, 2), c = c(NA, 2, 1))

...you can use mutate_all from the dplyr package to apply Hmisc's mean imputation to all columns:

    library(dplyr)

    df %>% mutate_all(.funs = ~ Hmisc::impute(., mean))

        a   b   c
    1 0.0 1.0 1.5
    2 1.0 1.5 2.0
    3 0.5 2.0 1.0

If there are columns you don't want to impute or cannot impute (e.g. character columns), you'd have to adjust the code slightly and probably switch to mutate_at, e.g.

    df %>%
      mutate_at(.vars = vars(a:c),
                .funs = ~ Hmisc::impute(., mean))
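If the data frame mixes numeric and character columns, one option is to impute only the numeric ones. The following is a minimal sketch in base R, not the Hmisc/dplyr approach itself: it fills each NA with the plain column mean, and the example data (including the character column d) is made up for illustration.

```r
# Hypothetical example data: three numeric columns and one character column
df <- data.frame(
  a = c(0, 1, NA),
  b = c(1, NA, 2),
  c = c(NA, 2, 1),
  d = c("x", NA, "z"),
  stringsAsFactors = FALSE
)

# Flag the numeric columns (integer and double both count as numeric)
num_cols <- vapply(df, is.numeric, logical(1))

# Replace each NA with its column mean; non-numeric columns are left untouched
df[num_cols] <- lapply(df[num_cols], function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})
```

Because the column selection is computed from the data, the same lines should work unchanged for 1000 columns. Note that an integer column becomes double once a non-integer mean is written into it, and the NA in the character column d stays as it is.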
tifu
  • Thank you @tifu for your answer. But what if the data frame has many different data types? For example: a, b, c are int, and d, e are string. – Lilis Gumilang Jul 03 '19 at 11:11
  • I am not sure what you mean. If you showed us a snippet of your real data, e.g. via `dput(df)`, it would be much easier to help you. – tifu Jul 03 '19 at 13:11