I am trying to build a sequence across two variables that shows how people have moved from one location to another. I have the following data:

name<- c("John", "John", "John", "Sam","Sam", "Robert", "Robert","Robert")
location<- c("London", "London", "Newyork", "Houston", "Houston", "London", "Paris","Paris")
start_yr<- c(2012, 2012, 2014, 2014, 2014,2012,2013, 2013)
end_yr<- c(2013, 2013, 2015, 2015,  2015, 2013, 2015, 2015)

df<- data.frame(name,location,start_yr, end_yr)

I need to seq_along the name and location and create a yearly transition variable that shows whether the person moved in that year. I tried the following, but it didn't work well: I got strange values, and the sequence for a name sometimes doesn't start at 1. Any suggestions on how to approach this problem?

ave(df$name,df$location, FUN = seq_along)

I would like to have

   name location move year
   John London   1    2012
   John London   0    2013
   John Newyork  1    2014
   John Newyork  0    2015
Psidom
user3570187

1 Answer

If I understand correctly, you could expand your data frame with complete() so that each name and location combination has one row per year from its minimum start_yr to its maximum end_yr, then group by name, order by start_yr, and use lag() to check whether the location changed:

library(dplyr)
library(tidyr)

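df %>%
  group_by(name, location) %>%
  # add one row per missing year between the first start_yr and the last end_yr
  complete(start_yr = full_seq(min(start_yr):max(end_yr), 1)) %>%
  group_by(name) %>%
  arrange(start_yr) %>%
  # unary + turns the logical comparison into 0/1 (NA when there is no previous row)
  mutate(move = +(lag(location) != location))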

This returns NA when there is no previous location for a given name, 0 if the location is unchanged, and 1 if it changed:

#Source: local data frame [14 x 5]
#Groups: name [3]
#
#     name location start_yr end_yr  move
#   (fctr)   (fctr)    (dbl)  (dbl) (int)
#1    John   London     2012   2013    NA
#2    John   London     2012   2013     0
#3    John   London     2013     NA     0
#4    John  Newyork     2014   2015     1
#5    John  Newyork     2015     NA     0
#6  Robert   London     2012   2013    NA
#7  Robert   London     2013     NA     0
#8  Robert    Paris     2013   2015     1
#9  Robert    Paris     2013   2015     0
#10 Robert    Paris     2014     NA     0
#11 Robert    Paris     2015     NA     0
#12    Sam  Houston     2014   2015    NA
#13    Sam  Houston     2014   2015     0
#14    Sam  Houston     2015     NA     0
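The output above still carries the duplicated input rows and marks each person's first location as NA, whereas the table in the question has one row per year and a 1 in the first year at each location. If that exact shape is the goal, the following is a minimal base R sketch of one way to get there, not part of the approach above. It assumes that each distinct (name, location, start_yr, end_yr) row is one stay and that the first year of a stay counts as a move; the names stays, years, n and out are made up for illustration.

```r
# Data from the question
name     <- c("John", "John", "John", "Sam", "Sam", "Robert", "Robert", "Robert")
location <- c("London", "London", "Newyork", "Houston", "Houston", "London", "Paris", "Paris")
start_yr <- c(2012, 2012, 2014, 2014, 2014, 2012, 2013, 2013)
end_yr   <- c(2013, 2013, 2015, 2015, 2015, 2013, 2015, 2015)
df <- data.frame(name, location, start_yr, end_yr)

# Assumption: identical rows describe the same stay, so keep each stay once
stays <- unique(df)

# One vector of years per stay, e.g. 2012:2013 for John in London
years <- Map(seq, stays$start_yr, stays$end_yr)
n <- lengths(years)

# Repeat each stay once per year; flag the first year of the stay with 1
out <- data.frame(
  name     = rep(stays$name, n),
  location = rep(stays$location, n),
  move     = unlist(lapply(n, function(k) c(1L, rep(0L, k - 1L)))),
  year     = unlist(years)
)

head(out, 4)
```

For John this gives London 2012 and Newyork 2014 with move 1, and London 2013 and Newyork 2015 with move 0, which matches the four rows in the question. Note that Robert's stays overlap in 2013 in the source data, so that year appears once for London and once for Paris.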
Steven Beaupré