I wrote a small function that counts the number of NA, NaN and Inf values in a tibble data frame as follows:
check.for.missing.values <- function(df) {
  m <- as.matrix(df)  # convert once and reuse
  return(
    sum(is.na(m) & !is.nan(m)) +  # NAs
    sum(is.infinite(m)) +         # Infs
    sum(is.nan(m))                # NaNs
  )
}
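For context, the `& !is.nan(...)` term in the function above is needed because `is.na()` is TRUE for NaN as well as NA. A minimal sketch on a plain numeric vector (the names `v`, `n_na`, `n_nan` and `n_inf` are only illustrative) shows how the three tests split the values:

```r
# Illustrative vector with one NA, one NaN and two infinite values
v <- c(1, NA, NaN, Inf, -Inf)

is.na(v)        # TRUE for both NA and NaN
is.nan(v)       # TRUE only for NaN
is.infinite(v)  # TRUE only for Inf and -Inf

# Separate counts, mirroring the three terms of the function
n_na  <- sum(is.na(v) & !is.nan(v))  # NA but not NaN
n_nan <- sum(is.nan(v))
n_inf <- sum(is.infinite(v))
```

This only holds as long as the values are still numeric when the tests are applied.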
I tested it with the following tibble:
library(dplyr)  # provides tibble() and mutate()
x1 <- tibble(x = 1:7,
             y = c(NA, NA, Inf, Inf, Inf, -Inf, -Inf),
             z = c(-Inf, -Inf, NaN, NaN, NaN, NaN, NaN))
x1
# A tibble: 7 × 3
      x     y     z
  <int> <dbl> <dbl>
1     1    NA  -Inf
2     2    NA  -Inf
3     3   Inf   NaN
4     4   Inf   NaN
5     5   Inf   NaN
6     6  -Inf   NaN
7     7  -Inf   NaN
And I get
check.for.missing.values(x1)
[1] 14
which of course is the correct answer.
Now, if the tibble that I pass to the function includes a date column, the function returns the wrong count (7 instead of the expected 14) and I can't figure out why:
x2 <- mutate(x1, date = as.Date('01/07/2008','%d/%m/%Y'))
x2
# A tibble: 7 × 4
      x     y     z date
  <int> <dbl> <dbl> <date>
1     1    NA  -Inf 2008-07-01
2     2    NA  -Inf 2008-07-01
3     3   Inf   NaN 2008-07-01
4     4   Inf   NaN 2008-07-01
5     5   Inf   NaN 2008-07-01
6     6  -Inf   NaN 2008-07-01
7     7  -Inf   NaN 2008-07-01
check.for.missing.values(x2)
[1] 7
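One way to investigate the drop from 14 to 7 might be to look at what `as.matrix()` returns once a Date column is present. The sketch below uses a small base-R data frame `d` (a hypothetical stand-in for `x2`, not the original tibble) and assumes a tibble is converted the same way as a plain data frame:

```r
# Hypothetical stand-in for x2: two numeric columns plus a Date column
d <- data.frame(
  y    = c(NA, Inf, -Inf),
  z    = c(-Inf, NaN, NaN),
  date = as.Date("2008-07-01")
)

m <- as.matrix(d)

typeof(m)            # "character": the mixed columns are converted to strings
m                    # print to see how NA, NaN and Inf are stored now

sum(is.na(m))        # 3: the NA and both NaN are now missing strings
sum(is.nan(m))       # 0: is.nan() is FALSE for character data
sum(is.infinite(m))  # 0: is.infinite() is FALSE for character data
```

If the same conversion happens for `x2`, only the `is.na()` term would still contribute, which seems consistent with 2 NA + 5 NaN = 7.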
Any clues as to what's going on?
Thanks
reyemarr