I have a dataframe with deforestation data for 543 survey sites. It has 20 columns for the values in 2001-2020 (X1, X2 etc) and another 20 columns with a measure of population density for the same years (columns pop01, pop02 etc).
> str(grid10b )
'data.frame': 543 obs. of 45 variables:
$ X1 : int 0 0 0 0 0 0 0 0 0 0 ...
$ X2 : num 0.000889 0.000119 0.002048 0.00066 0.003605 ...
$ X3 : num 0.004645 0.000612 0.007276 0.002608 0.003475 ...
$ X4 : num 6.70e-04 8.07e-05 1.99e-03 1.19e-03 1.89e-03 ...
$ X5 : num 0.001447 0.000183 0.00314 0.001687 0 ...
$ X6 : num 0.000659 0.000115 0.002078 0.001113 0.000869 ...
...etc. I can merge the deforestation columns (thanks to the answer here: Reshaping longitudinal dataset with tmerge or SurvSplit?)
The code so far is:
grid10a <- grid10a %>%
tidyr::pivot_longer(cols = starts_with('X'), values_to = 'def') %>%
group_by(id) %>%
mutate(tstart = row_number(),
tstop = tstart+1) %>%
select(-name) # otherwise there's a column with X1, X2 etc which isn't needed
...this merges the 20 columns with deforestation values into a single column 'def' and gives me 20 rows for each site ID. So far so good.
But how can I merge the population density columns? I just need to add these into a 'population' column as they're in the same year order as the values I just tidied. I need to line up the values of X1 and pop01, X2 and pop02 and so on.
I tried this next:
grid10c <- grid10b %>%
tidyr::pivot_longer(cols = starts_with('pop'), values_to = 'popn') %>% group_by(id2)
...but ended up with a dataframe of 228,060 rows! The solution must be something like the first answer here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format)
...but the use of 'names_to' and 'names_sep' isn't really explained.
Here's a dummy example of the sort of datafrae structure I have (df1) and the sort I want to build(df2):
df1 <- data.frame(ID = seq(1, 543),
X1 = runif(543, 0, 1),
X2 = runif(543, 0, 1),
X3 = runif(543, 0, 1),
X4 = runif(543, 0, 1),
X5 = runif(543, 0, 1),
X6 = runif(543, 0, 1),
X7 = runif(543, 0, 1),
X8 = runif(543, 0, 1),
X9 = runif(543, 0, 1),
X10 = runif(543, 0, 1),
X11 = runif(543, 0, 1),
X12 = runif(543, 0, 1),
X13 = runif(543, 0, 1),
X14 = runif(543, 0, 1),
X15 = runif(543, 0, 1),
X16 = runif(543, 0, 1),
X17 = runif(543, 0, 1),
X18 = runif(543, 0, 1),
X19 = runif(543, 0, 1),
X20 = runif(543, 0, 1),
pop01 = runif(543, 0, 100),
pop02 = runif(543, 0, 100),
pop03 = runif(543, 0, 100),
pop04 = runif(543, 0, 100),
pop05 = runif(543, 0, 100),
pop06 = runif(543, 0, 100),
pop07 = runif(543, 0, 100),
pop08 = runif(543, 0, 100),
pop09 = runif(543, 0, 100),
pop10 = runif(543, 0, 100),
pop11 = runif(543, 0, 100),
pop12 = runif(543, 0, 100),
pop13 = runif(543, 0, 100),
pop14 = runif(543, 0, 100),
pop15 = runif(543, 0, 100),
pop16 = runif(543, 0, 100),
pop17 = runif(543, 0, 100),
pop18 = runif(543, 0, 100),
pop19 = runif(543, 0, 100),
pop20 = runif(543, 0, 100))
df2 <- data.frame(ID = rep(1:543,each = 20),
def = runif(10860, 0, 1),
popn = runif(10860 , 0, 100))