
By chance, I found that my first column, a POSIXct vector, has time gaps in it. My data set should contain one observed value per minute, but, for instance, between 10:04:00 and 10:07:00 two values are missing:

Date_time  
2016-05-11 10:02:00  
2016-05-11 10:03:00  
2016-05-11 10:04:00  
2016-05-11 10:07:00  
2016-05-11 10:08:00

I am working with a large data set and would like to find out how many of these time gaps exist and at which positions they occur. I tried to work with seq(), but I do not know how to use it with values of class POSIXct. Thanks

  • What do you mean by "gap"? – amatsuo_net May 23 '17 at 10:42
  • I edited my post to clarify my problem. By time gap I mean that the data set should have a value for each minute, but there are some gaps that are larger than one minute, i.e. missing values. – Carolus Fridericus May 23 '17 at 12:35
  • have a look at `?diff.difftime()`...Maybe `which(diff.difftime(ddf$Date_time) != 1)` or `length(which(...))` to find out how many – Sotos May 23 '17 at 12:37
  • Do you mean the difftime() command? I have already tried that function but then I need a code which contains the difftime() for each element. The command would ask whether the time difference of two consecutive elements is larger than 1 and would ask where these elements are. How can I get to this? – Carolus Fridericus May 23 '17 at 13:04

1 Answer


A data.table solution:

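library(data.table)
library(dplyr)  # used here only for the %>% pipe
# Read the sample timestamps; as.is = TRUE keeps them as character strings
dt <- read.csv(text = 'Date_time
2016-05-11 10:02:00
2016-05-11 10:03:00
2016-05-11 10:04:00
2016-05-11 10:07:00
2016-05-11 10:08:00', as.is = TRUE) %>% setDT()
# Convert to POSIXct; strptime() returns POSIXlt, which data.table columns do not support well.
# tz = "UTC" is an assumption that avoids daylight-saving artifacts.
dt[, Date_time := as.POSIXct(Date_time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")]
# Difference to the previous timestamp, then count how often each difference occurs
dt[, diff := Date_time - shift(Date_time)][, .N, by = diff]
##       diff N
## 1: NA mins 1
## 2:  1 mins 3
## 3:  3 mins 1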
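
The table above counts the gaps by size but does not say where they are. A minimal base R sketch for the "at which position" part of the question could look like the following. The names `date_time`, `step_mins`, `gap_idx` and `gaps` are made up for illustration, and the UTC time zone is an assumption.

```r
# Sample timestamps from the question, parsed as POSIXct (UTC is an assumption)
date_time <- as.POSIXct(c("2016-05-11 10:02:00",
                          "2016-05-11 10:03:00",
                          "2016-05-11 10:04:00",
                          "2016-05-11 10:07:00",
                          "2016-05-11 10:08:00"), tz = "UTC")

# Minutes between each timestamp and its predecessor (length n - 1)
step_mins <- as.numeric(diff(date_time), units = "mins")

# Each entry is the index of the last observation before a gap
gap_idx <- which(step_mins > 1)

gaps <- data.frame(
  gap_start   = date_time[gap_idx],      # last timestamp before the gap
  gap_end     = date_time[gap_idx + 1],  # first timestamp after the gap
  missing_obs = step_mins[gap_idx] - 1   # number of absent one-minute readings
)

n_gaps <- nrow(gaps)  # how many gaps there are
print(gaps)
```

For the sample data this reports a single gap starting at row 3 (10:04:00 to 10:07:00) with two missing readings. If the real timestamps carry fractional seconds, the `> 1` comparison may need a small tolerance.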
amatsuo_net
  • Ok. I have got it now, though I only understand half of your code. I did it step by step like this: `dt <- read.csv(text ='Date_time 2016-05-11 10:02:00 2016-05-11 10:03:00 2016-05-11 10:04:00 2016-05-11 10:07:00 2016-05-11 10:08:00', as.is = T)`, then `diff = wind.data$Date_time - shift(wind.data$Date_time)`, then `which(diff)` – Carolus Fridericus May 26 '17 at 12:36
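
The seq() idea from the question also works with POSIXct values. One possible sketch, assuming the data should sit on a regular one-minute grid, builds the complete grid and lists the timestamps that never occur in the data. The names `full_grid` and `missing_times` are hypothetical, and UTC is again an assumption.

```r
# Sample timestamps from the question, parsed as POSIXct (UTC is an assumption)
date_time <- as.POSIXct(c("2016-05-11 10:02:00",
                          "2016-05-11 10:03:00",
                          "2016-05-11 10:04:00",
                          "2016-05-11 10:07:00",
                          "2016-05-11 10:08:00"), tz = "UTC")

# Complete one-minute grid spanning the observed range
full_grid <- seq(from = min(date_time), to = max(date_time), by = "min")

# Grid timestamps that are absent from the data
missing_times <- full_grid[!full_grid %in% date_time]

print(missing_times)
length(missing_times)  # total number of missing one-minute readings
```

This lists the missing minutes themselves rather than the gap boundaries, which may be more convenient if the gaps are later filled with NA rows.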