I have a big data.frame where I can find a municipality id for each person(id) over time. Sometimes they move and there will be a new municipality_id from the next day on. However, sometimes the municipality_id stays the same. I would like to collapse those date intervals by each id if there is no real change in municipality_id
This data
id municipality_id from to
1 A 820 2007-01-01 2007-02-28
2 A 200 2007-03-01 2100-01-01
3 B 820 2007-01-01 2007-03-31
4 B 820 2007-04-01 2007-05-31
5 B 830 2007-06-01 2008-01-31
6 B 830 2008-02-01 2100-01-01
7 C 700 2007-01-01 2007-05-31
8 C 500 2007-06-01 2008-12-31
9 C 700 2009-01-01 2100-01-01
should turn into this (two observations for B were extended)
id municipality_id from to
1 A 820 2007-01-01 2007-02-28
2 A 200 2007-03-01 2100-01-01
3 B 820 2007-01-01 2007-05-31
4 B 830 2007-06-01 2100-01-01
5 C 700 2007-01-01 2007-05-31
6 C 500 2007-06-01 2008-12-31
7 C 700 2009-01-01 2100-01-01
Here the code to create my two tables:
data <- data.frame(id = c('A', 'A', 'B', 'B','B', 'B', 'C', 'C', 'C'),
municipality_id = c(820, 200, 820, 820, 830, 830, 700, 500, 700),
from = as.Date(c("2007-01-01", "2007-03-01", "2007-01-01", "2007-04-01", "2007-06-01", "2008-02-01", "2007-01-01", "2007-06-01", "2009-01-01")),
to = as.Date(c("2007-02-28", "2100-01-01", "2007-03-31", "2007-05-31", "2008-01-31", "2100-01-01", "2007-05-31", "2008-12-31", "2100-01-01")))
Should turn into:
data_edit <- data.frame(id = c('A', 'A', 'B', 'B', 'C', 'C', 'C'),
municipality_id = c(820, 200, 820, 830, 700, 500, 700),
from = as.Date(c("2007-01-01", "2007-03-01", "2007-01-01", "2007-06-01", "2007-01-01", "2007-06-01", "2009-01-01")),
to = as.Date(c("2007-02-28", "2100-01-01", "2007-05-31", "2100-01-01", "2007-05-31", "2008-12-31", "2100-01-01")))
Is there an easy solution with R? Thank you for helping me :)