I have a one-column data frame and want to determine whether the values are increasing (1) or decreasing (-1); when there is no change, the previous direction should be carried forward. I think the code below should do it, but dplyr returns an error saying the object is not found, presumably because ValDirection refers to itself. Any thoughts on how this can be done?

df <- data.frame(Val = c(1:5,5,5,5:1,1,1,1,6,1,1,5:1))

df %>%
  mutate(ValDirection = ifelse(Val > lag(Val, 1), 1,
                               ifelse(Val < lag(Val, 1), -1, lag(ValDirection, 1))))

The desired result is:

df <- data.frame(Val = c(1:5,5,5,5:1, 1,1,1,6,1,1,5:1),
                 ValDirection = c(1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,1,-1,-1,-1,-1))
Darren Tsai
Camilo
  • You're defining and testing against `ValDirection` in the same `mutate` statement. This doesn't work because it doesn't exist yet. You'll have to calculate the lag first and then do a second sweep. – thelatemail Apr 28 '23 at 01:09
  • Thank you, @thelatemail; that is indeed the problem I want to fix. I tried running a second mutate based on the first result, but that still fails when more than two consecutive values are equal. `df=df %>% mutate(ValDirection = ifelse(Val>lag(Val,1), 1, (ifelse(Val<lag(Val,1), -1, NA)))) %>% mutate(ValDirection2 = ifelse(is.na(ValDirection), lag(ValDirection), ValDirection))` – Camilo Apr 28 '23 at 01:22
  • Deleted my answer as I missed the part about returning "the last calculation done". Sorry for the misunderstanding. Thankfully, the answer by @Darren Tsai achieves what you want. – L Tyrone Apr 28 '23 at 02:14

1 Answer


The error occurs because ValDirection is referenced inside the same mutate() call that creates it, so it does not exist yet. You can replace lag(ValDirection, 1) with NA and then use tidyr::fill() to fill each missing value with the previous non-missing value.

library(dplyr)

df %>%
  mutate(ValDirection = ifelse(Val > lag(Val, 1), 1, ifelse(Val < lag(Val, 1), -1, NA))) %>%
  tidyr::fill(ValDirection)

You can also use case_when() from dplyr to replace the nested ifelse():

df %>%
  mutate(ValDirection = case_when(Val > lag(Val, 1) ~ 1, Val < lag(Val, 1) ~ -1)) %>%
  tidyr::fill(ValDirection)

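An alternative is to take the sign of the differences and turn the zeros into NA before filling. Because the first difference is set to 1, this version also labels the first row 1, whereas the two versions above leave it NA (lag() has no previous value there):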

df %>%
  mutate(ValDirection = na_if(sign(c(1, diff(Val))), 0)) %>%
  tidyr::fill(ValDirection)

Output (first 12 rows):
#    Val ValDirection
# 1    1            1
# 2    2            1
# 3    3            1
# 4    4            1
# 5    5            1
# 6    5            1
# 7    5            1
# 8    5            1
# 9    4           -1
# 10   3           -1
# 11   2           -1
# 12   1           -1
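
If you would rather avoid the tidyr dependency, the same carry-forward idea can be sketched in base R. This is only a rough sketch, not part of the original answer: `val_direction` is a hypothetical helper name, it assumes the input has no NA values, and it treats the first element as increasing (as the `c(1, diff(Val))` version does).

```r
# Hypothetical helper: direction of change, carrying the last non-zero
# direction forward over runs of equal values.
# Assumes x is numeric with no NA values.
val_direction <- function(x) {
  if (length(x) == 0) return(numeric(0))
  # Sign of each step; the first element is treated as increasing (1)
  d <- sign(c(1, diff(x)))
  # For every position, the index of the most recent non-zero step
  last_change <- cummax(seq_along(d) * (d != 0))
  d[last_change]
}

Val <- c(1:5, 5, 5, 5:1, 1, 1, 1, 6, 1, 1, 5:1)
expected <- c(1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1,
              1, -1, -1, 1, -1, -1, -1, -1)

# Stops with an error if the helper disagrees with the desired column
stopifnot(all(val_direction(Val) == expected))
```

The `cummax()` trick works because positions with no change contribute 0, so the running maximum always points at the latest row where the value actually moved.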
Darren Tsai