I want to place the results of a lagged diff back into my data frame. It means I would have leading NAs for the different lags.

I am using:

new.df$lag1 <- diff(new.df$Close, lag = 1, differences = 1, arithmetic = TRUE, na.pad = TRUE)
Error in `$<-.data.frame`(`*tmp*`, lag1, value = c(0.248860000000001,  : 
  replacement has 6177 rows, data has 6178

I thought that na.pad = TRUE would place an NA on row 1 and the lagged difference on row 2, but this is not the case.

Here's some sample data:

data <- c(10,15,89,40,55,67,79)

lag1 <- diff(data, lag = 1, differences = 1, arithmetic = TRUE, na.pad = TRUE)

The goal is to place this back into the data frame, with leading NAs depending on the number of lags.

Andrew Bannerman

1 Answer

dta = c(10,15,89,40,55,67,79)

require(zoo)

apply(lag(zoo(dta), c(-1,0), na.pad = TRUE), 1L, diff)

#> apply(lag(zoo(dta), c(-1,0), na.pad = TRUE), 1L, diff)
#  1   2   3   4   5   6   7 
# NA   5  74 -49  15  12  12 

Also, try to avoid naming your objects with names already used by base R (like data)!
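
If the goal is only to write the padded result back into the data frame, base R alone may be enough: pad the output of diff() by hand with as many leading NAs as the lag. This is a minimal sketch, not the approach used above. The helper name lagged_diff is hypothetical, the toy new.df reuses the sample values under the Close column name from the question, and it assumes the lag k is smaller than the length of the vector.

```r
# Hypothetical helper: lagged difference padded with k leading NAs,
# so the result has the same length as the input.
# Assumes k < length(x).
lagged_diff <- function(x, k = 1L) {
  c(rep(NA_real_, k), diff(x, lag = k))
}

dta <- c(10, 15, 89, 40, 55, 67, 79)
new.df <- data.frame(Close = dta)  # toy stand-in for the real data frame

# Each new column has one row per row of new.df, so the assignment works.
new.df$lag1 <- lagged_diff(new.df$Close, 1L)
new.df$lag2 <- lagged_diff(new.df$Close, 2L)
new.df
```

Because the padding is explicit, the number of leading NAs always matches the lag, whichever packages are attached.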


On May 10th, 2018, @thistleknot pointed out to me (thanks!) that dplyr masks the lag generic from stats. So make sure dplyr is not attached, or call stats::lag explicitly; otherwise my code won't run.

I think I found the culprit: github.com/tidyverse/dplyr/issues/1586 answer: This is a natural consequence of having lots of R packages. Just be explicit and use stats::lag or dplyr::lag
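
For reference, here is a sketch of the same computation with the namespace spelled out, so that it should not matter whether dplyr is attached. It assumes the zoo package is installed; the variable names lagged and res are only illustrative.

```r
library(zoo)

dta <- c(10, 15, 89, 40, 55, 67, 79)

# stats::lag is the S3 generic, so it dispatches to zoo's lag method
# for zoo objects even if dplyr::lag is masking lag() on the search path.
lagged <- stats::lag(zoo(dta), c(-1, 0), na.pad = TRUE)

# Row-wise difference between the current value and the previous one.
res <- apply(lagged, 1L, diff)
res
```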

catastrophic-failure
  • used your answer above on dta... Error: `n` must be a nonnegative integer scalar, not double of length 2 – Andrew Bannerman Aug 11 '17 at 17:38
  • see this: https://stackoverflow.com/questions/45642435/rolling-lagged-differences – Andrew Bannerman Aug 11 '17 at 19:24
  • "Error: n must be a nonnegative integer scalar," I have the same issue at home, but had it working at work... hrm... – thistleknot May 09 '18 at 03:46
  • @thistleknot I have it working on R x64 3.3.0/3.3.1/3.4.0/3.5.0 on Windows 10. What about you? – catastrophic-failure May 09 '18 at 13:00
  • it's odd. sometimes I get it to work, sometimes I don't. I suspect somewhere in my script it's unloading the library? Not sure yet. – thistleknot May 09 '18 at 17:20
  • I think I found the culprit: https://github.com/tidyverse/dplyr/issues/1586 answer: This is a natural consequence of having lots of R packages. Just be explicit and use stats::lag or dplyr::lag – thistleknot May 09 '18 at 20:08
  • @thistleknot WOW. Thanks for this discovery! Ideally `dplyr` should stop masking base generics, but a lot of packages do that. Perhaps CRAN could become more emphatic regarding it. I'll update some of my answers crediting your find. Thanks again! – catastrophic-failure May 10 '18 at 12:30
  • @AndrewBannerman Please see the conversation and my edit on why it might not be running on your end. – catastrophic-failure May 10 '18 at 12:34