I am fairly new to R and was trying to determine if I could use R to help fill in missing values in a number of large data sets I am working with. I'll try to explain it to the best of my abilities.
The data set I am working with has time data in the format HH:MM:SS. The timestamps are irregular: no two data sets share the same timestamps, and each data set records events over a 2-hour period. It looks something like this:
Date, Time_hms, Event
9/22/2015, 00:00:00, 5
9/22/2015, 00:00:24, 1
9/22/2015, 00:00:24, 4
9/22/2015, 00:01:42, 7
9/22/2015, 00:02:04, 3
9/22/2015, 00:02:35, 2
9/22/2015, 00:03:02, 4
What I would like to do is add in missing rows at intervals of one minute, so that it looks like this.
Date, Time_hms, Event
9/22/2015, 00:00:00, 5
9/22/2015, 00:00:24, 1
9/22/2015, 00:00:24, 4
9/22/2015, 00:01:00, 4 # Summary row to be inserted
9/22/2015, 00:01:42, 7
9/22/2015, 00:02:00, 7 # Summary row to be inserted
9/22/2015, 00:02:04, 3
9/22/2015, 00:02:35, 2
9/22/2015, 00:03:00, 2 # Summary row to be inserted
9/22/2015, 00:03:02, 4
If possible, I would like each inserted row to be filled in with the event that was in effect at that time, i.e. the most recent Event value recorded before the minute mark, as in the example above.
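To make the desired transformation concrete, here is a minimal base R sketch. It assumes a single date per data set and rows already sorted by time; the helper names `to_secs` and `to_hms` are made up for illustration. It works on the HH:MM:SS strings directly by converting them to seconds since midnight, and it may well not be the most idiomatic way to do this.

```r
# Toy data copied from the example above
df <- data.frame(
  Date = "9/22/2015",
  Time_hms = c("00:00:00", "00:00:24", "00:00:24", "00:01:42",
               "00:02:04", "00:02:35", "00:03:02"),
  Event = c(5, 1, 4, 7, 3, 2, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "HH:MM:SS" -> seconds since midnight
to_secs <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Hypothetical helper: seconds since midnight -> "HH:MM:SS"
to_hms <- function(s) {
  sprintf("%02d:%02d:%02d",
          as.integer(s %/% 3600),
          as.integer((s %% 3600) %/% 60),
          as.integer(s %% 60))
}

secs <- to_secs(df$Time_hms)  # assumed to be in non-decreasing order

# Whole-minute marks inside the observed range that are not already present
first_mark <- 60 * ceiling(min(secs) / 60)
last_mark  <- 60 * floor(max(secs) / 60)
marks <- if (first_mark <= last_mark) seq(first_mark, last_mark, by = 60) else numeric(0)
marks <- setdiff(marks, secs)

# For each mark, index of the last observed row before it
# (last observation carried forward)
idx <- findInterval(marks, secs)

new_rows <- data.frame(
  Date = df$Date[idx],
  Time_hms = to_hms(marks),
  Event = df$Event[idx],
  stringsAsFactors = FALSE
)

# Combine and re-sort by time; order() keeps tied rows in their original order
out <- rbind(df, new_rows)
out <- out[order(to_secs(out$Time_hms)), ]
rownames(out) <- NULL
out
```

On the sample data this yields the ten rows of the desired output shown above, with the inserted 00:01:00, 00:02:00 and 00:03:00 rows carrying Event values 4, 7 and 2.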
In trying to solve this, I found and tried the approach from the question "Insert rows for missing dates/times". I tried using POSIXct but was unsuccessful because of the date format. I have also considered padr and fill_by_function, but am uncertain whether that is the correct approach. Is there a way to work strictly with the HH:MM:SS format?
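For reference, this is roughly the kind of POSIXct conversion in question, sketched with base R only. It assumes the dates are month/day/year and uses UTC purely to avoid daylight-saving surprises; the two timestamps are taken from the sample data.

```r
# First and last timestamps from the sample data
dates <- c("9/22/2015", "9/22/2015")
times <- c("00:00:24", "00:03:02")

# An explicit format string tells as.POSIXct how to read "m/d/Y H:M:S"
# (assumption: dates are month/day/year)
stamps <- as.POSIXct(paste(dates, times),
                     format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Truncate the first timestamp to the whole minute, then step by one minute
start <- as.POSIXct(trunc(min(stamps), "mins"))
minute_grid <- seq(start, max(stamps), by = "min")

# Back to plain HH:MM:SS strings
grid_hms <- format(minute_grid, "%H:%M:%S")
grid_hms
```

With these two inputs the grid covers 00:00:00 through 00:03:00 in one-minute steps.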
Again, I am only just learning R and am unsure of how to approach this. Any help or suggestions would be greatly appreciated!
Edit: Hopefully I did this correctly. Thank you again!
dput(elements)
structure(list(var1 = c("Date", "9/22/2015", "9/22/2015", "9/22/2015",
"9/22/2015", "9/22/2015", "9/22/2015", "9/22/2015"), var2 = c("Time_hms",
"00:00:00", "00:00:24", "00:00:24", "00:01:42", "00:02:04", "00:02:35",
"00:03:02"), var3 = c("Event", "5", "1", "4", "7", "3", "2",
"4")), .Names = c("var1", "var2", "var3"), row.names = c(NA,
8L), class = "data.frame")
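Note that in this dput output the header ("Date", "Time_hms", "Event") is stored as the first data row and every column is character. A small sketch of one way to tidy that up before doing anything else, assuming the first row always holds the column names:

```r
# Same data as the dput output above
elements <- data.frame(
  var1 = c("Date", "9/22/2015", "9/22/2015", "9/22/2015",
           "9/22/2015", "9/22/2015", "9/22/2015", "9/22/2015"),
  var2 = c("Time_hms", "00:00:00", "00:00:24", "00:00:24",
           "00:01:42", "00:02:04", "00:02:35", "00:03:02"),
  var3 = c("Event", "5", "1", "4", "7", "3", "2", "4"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names, then drop it
names(elements) <- as.character(unlist(elements[1, ]))
elements <- elements[-1, ]
rownames(elements) <- NULL

# Event was read as text; convert it to numeric
elements$Event <- as.numeric(elements$Event)

str(elements)
```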