As the title says, I made a simple example to test drop_na() from tidyr:
library(tidyr)
library(dplyr)
# (1.) create a dataset with two POSIX columns, "ct" and "lt"
data <- data.frame(n = 1:5)
data$ct <- as.POSIXct(Sys.time() + rnorm(5) * 1000)
data$lt <- as.POSIXlt(Sys.time() + rnorm(5) * 1000)
str(data)
# $ n : int 1 2 3 4 5
# $ ct: POSIXct, format: "2018-10-07 03:02:28" ...
# $ lt: POSIXlt, format: "2018-10-07 02:37:26" ...
# (2.) set the third value of "ct" and "lt" to NA
data[3, c("ct", "lt")] <- NA
# (3.) use different functions to remove rows with NA
data %>% is.na() # identifies the NAs in both "ct" and "lt"
data %>% drop_na('ct') # drops the row where "ct" is NA
data %>% drop_na('lt') # does NOT drop the row where "lt" is NA
data[c(1, 2)] %>% na.omit() # drops the row where "ct" is NA
data[c(1, 3)] %>% na.omit() # does NOT drop the row where "lt" is NA
From the results above, if a POSIXlt column contains NAs, only is.na() identifies them,
so it is the only one of these functions that can be used to drop the rows with NAs.
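To make that conclusion concrete, here is a minimal sketch of dropping the rows by filtering on is.na() directly. The names df, keep, kept and df_ct are made up for illustration, and a fixed timestamp replaces Sys.time() so the example is reproducible. The second half is only a possible alternative, assuming it is acceptable to convert the column to POSIXct first (the class that na.omit() handled in the test above).

```r
# Illustrative data: one POSIXlt column with a missing value in row 3.
times <- as.POSIXct("2018-10-07 03:00:00", tz = "UTC") + (1:5) * 1000
times[3] <- NA
df <- data.frame(n = 1:5)
df$lt <- as.POSIXlt(times)

# Option 1: build a logical index with is.na() and subset the rows.
keep <- !is.na(df$lt)
kept <- df[keep, ]
nrow(kept)

# Option 2 (assumption: converting the column is acceptable):
# turn the POSIXlt column into POSIXct, then use na.omit() as usual.
df_ct <- df
df_ct$lt <- as.POSIXct(df_ct$lt)
nrow(na.omit(df_ct))
```

Both options leave four rows in this example; which one fits better depends on whether the column has to stay POSIXlt.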
I roughly know the difference between POSIXct and POSIXlt.
POSIXct represents the number of seconds since the beginning of 1970 as a numeric vector.
POSIXlt is a named list of vectors representing the components of a date-time (sec, min, hour, mday, mon, year, and so on).
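The difference in representation can be inspected directly. This is only a small sketch; t_ct and t_lt are names chosen for illustration, and the timestamp is arbitrary.

```r
# The same instant stored in both classes.
t_ct <- as.POSIXct("2018-10-07 03:02:28", tz = "UTC")
t_lt <- as.POSIXlt(t_ct)

# POSIXct is a number underneath; POSIXlt is a list underneath.
typeof(t_ct)  # "double"
typeof(t_lt)  # "list"

# Strip the class to look at the raw storage.
unclass(t_ct)         # seconds since 1970-01-01 UTC (plus a tzone attribute)
names(unclass(t_lt))  # component names such as "sec", "min", "hour", ...
```

So a POSIXct column is an ordinary atomic vector, while a POSIXlt column is a list of component vectors.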
So can someone explain why missing values in a POSIXlt column cannot be identified by drop_na() and na.omit()?