Lagged difference between rows in R: a different take

Question

My question is similar to a few that have been asked before, but I hope different enough to warrant a separate question.

See here, and here. I'll reuse some of the example data from those questions. For context: I am looking at how my observed catch rate (sea creatures) changed over multiple days of sampling the same area.

I want to calculate the difference between the first sample day at a given site (first Letter in data below), and the subsequent sample days (next rows of same letter).

 #Example data   
 df <- data.frame(
 id = c("A", "A", "A", "A", "B", "B", "B"), 
 num = c(1, 8, 6, 3, 7, 7 , 9),
 What_I_Want = c(NA, 7, 5, 2, NA, 0, 2))

The first solution that I found calculates a lagged difference between consecutive rows. I wanted this calculation as well, so it was helpful to find:

#Calculate lagged differences
library(dplyr)  # provides %>%, group_by(), mutate(), lag()

df_new <- df %>% 
  # group by condition
  group_by(id) %>% 
  # find difference from the previous row within each group
  mutate(diff = num - lag(num))

Here the difference is between A.1 and A.2; then A.2 and A.3 etc...

What I would like to do now is calculate the difference with respect to the first value of each group. So for letter A, I would like to calculate 1 - 8, then 1 - 6, and finally 1 - 3. Any suggestions?

One clunky solution (linked above) is to create two (or more) columns, one for each lag distance, and somehow merge the results that I want:

df_clunky = df %>%
group_by(id) %>%
mutate(
deltaLag1 = num - lag(num, 1),
deltaLag2 = num - lag(num, 2))

@Hack-R yes thank you, I've been using abs() on my real data and thinking in terms of absolute difference, but you are correct — Kodiakflds, Jan 03 '17 at 20:55

score 1 · Accepted Answer · answered Jan 03 '17 at 20:19

Here is a base R method with replace and ave

ave(df$num , df$id, FUN=function(x) replace(x - x[1], 1, NA))
[1] NA  7  5  2 NA  0  2

ave applies the anonymous function to the num values of each id group. Within a group, x - x[1] subtracts the group's first value from every element, and replace then sets the first element of that result to NA.
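Since the question uses dplyr, the same idea can probably be written inside the existing group_by() pipeline. This is only a sketch, assuming dplyr is installed; the column name diff_first is made up for illustration and is not from the question.

```r
library(dplyr)

# Example data from the question (without the What_I_Want column)
df <- data.frame(
  id  = c("A", "A", "A", "A", "B", "B", "B"),
  num = c(1, 8, 6, 3, 7, 7, 9)
)

df_first <- df %>%
  group_by(id) %>%
  # subtract the group's first value from every row,
  # then blank out the first row of each group with NA
  mutate(diff_first = replace(num - first(num), 1, NA)) %>%
  ungroup()

df_first$diff_first
# [1] NA  7  5  2 NA  0  2
```

Here first(num) plays the role of x[1] in the ave version, so the result should match the What_I_Want column. Wrapping the difference in abs() would give the absolute difference mentioned in the comments.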
