11

I have a data.frame with a date-column. These dates can occur many times, but also zero-time:

        date value
1 2013-01-01     5
2 2013-01-01     3
3 2013-01-03     3
4 2013-01-04     3
5 2013-01-04     1
6 2013-01-06     1

How do I fill the date-gaps in this data.frame so I get the following?

        date value
1 2013-01-01     5
2 2013-01-01     3
3 2013-01-02     0
4 2013-01-03     3
5 2013-01-04     3 
6 2013-01-04     1
7 2013-01-05     0
8 2013-01-06     1

Any help is welcome.

TIA, Jerry

JerryWho
  • 3,060
  • 6
  • 27
  • 49

1 Answers1

18

You can merge your data.frame with another data.frame containg all the dates in sequence. here I assume that dat is your original data.frame.

hh<- data.frame(date=seq(as.Date("2013-01-01"), as.Date("2013-01-6"), by="days"))
>res <- merge(dat,hh,by.x='date',by.y='date',all.x=T,all.y=T)
        date value
1 2013-01-01     5
2 2013-01-01     3
3 2013-01-02    NA
4 2013-01-03     3
5 2013-01-04     3
6 2013-01-04     1
7 2013-01-05    NA
8 2013-01-06     1

Now we have NA for each row in dat that has no matching row in hh. Personaly, I think it is better to have NA to say that theses are missing values But you can set them to 0:

res$value[is.na(res$value)] <- 0

Edit

for generality you can generate hh as shown in @Arun solution:

      hh <- seq(min(dat$date), max(dat$date), by="days")
agstudy
  • 119,832
  • 17
  • 199
  • 261
  • 1
    @Arun thanks! I used your idea to generate the seq dates in general case. – agstudy Jan 20 '13 at 13:11
  • Thanks a lot. I was trying with a second data.frame, too. But I used always a df with date AND value column for the default values. So after merging I always got too many columns. :-( – JerryWho Jan 20 '13 at 13:51
  • @JerryWho I see , the key to use merge are the keys columns ..since you give the by.x and the by.y others columns doesn't matter.. – agstudy Jan 20 '13 at 13:54