1

I am having a very hard time leading or lagging an entire dataframe. I can shift individual columns with the following attempts, but not the whole dataframe:

require('DataCombine')
df_l <- slide(df, Var = var1, slideBy = -1)

Using colnames(x_ret_mon) as Var does not work: I am told the variable names are not found in the dataframe.

This attempt shifts the columns right but not down:

df_l <- dplyr::lag(df)

This only creates new variables for the lagged values, and I do not know how to effectively delete the old non-lagged values:

df_l <- shift(df, n = 1L, fill = NA, type = "lead", give.names = FALSE)
Niccola Tartaglia

3 Answers

7

Use dplyr::mutate_all to apply lags or leads to all columns.

df = data.frame(a = 1:10, b = 21:30)
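# Name dplyr::lag explicitly: without dplyr attached, a bare `lag` is stats::lag,
# which does not shift plain vectors.
dplyr::mutate_all(df, dplyr::lag)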
    a  b
1  NA NA
2   1 21
3   2 22
4   3 23
5   4 24
6   5 25
7   6 26
8   7 27
9   8 28
10  9 29
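
In dplyr 1.0.0 and later, mutate_all is superseded by across(). A minimal sketch of the same idea, assuming a recent dplyr is installed (the names df_lag and df_lead are only illustrative):

```r
library(dplyr)

df <- data.frame(a = 1:10, b = 21:30)

# Lag every column by one row; the first row becomes NA.
df_lag <- mutate(df, across(everything(), dplyr::lag))

# Lead every column by one row; the last row becomes NA.
df_lead <- mutate(df, across(everything(), dplyr::lead))
```

Both dplyr::lag and dplyr::lead take an n argument, so a lambda such as ~ dplyr::lag(.x, n = 2) inside across() should shift by more than one row.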
Ruben
  • There seems to be a warning about the use of funs(): "funs() is soft deprecated as of dplyr 0.8.0. Please use a list of either functions or lambdas", e.g. a simple named list `list(mean = mean, median = median)`, an auto-named list with `tibble::lst(mean, median)`, or lambdas `list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))`. "This warning is displayed once per session." – vlad1490 Nov 03 '19 at 14:48
3

I don't see the point in lagging all columns in a data.frame. Wouldn't that just correspond to rbinding an NA row to your original data.frame (minus its last row)?

df = data.frame(a = 1:10, b = 21:30)
rbind(NA, df[-nrow(df), ])
#    a  b
#1  NA NA
#2   1 21
#3   2 22
#4   3 23
#5   4 24
#6   5 25
#7   6 26
#8   7 27
#9   8 28
#10  9 29

And similarly for leading all columns.
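
For illustration, the leading counterpart could look like the sketch below: drop the first row and append an NA row. The name df_lead is only illustrative, and the row names are reset because the subset keeps the original ones.

```r
df <- data.frame(a = 1:10, b = 21:30)

# Drop the first row and append a row of NAs at the bottom.
df_lead <- rbind(df[-1, ], NA)

# df[-1, ] keeps the original row names (2, 3, ...), so renumber them.
rownames(df_lead) <- NULL
```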

Maurits Evers
  • Yes, it would correspond to that, but I did not think of this solution. Why is it so uncommon to lag all columns in a dataframe? I am using this for a regression and would like to lag all regressors. This works perfectly, thanks! – Niccola Tartaglia Mar 22 '18 at 20:28
  • No problem @NiccolaTartaglia; glad it helped. I'm still not clear on why you'd want to do that. Just use all predictor values minus the first row in your model, if that's what you need. I don't think this has anything to do with "lagging" values. – Maurits Evers Mar 22 '18 at 20:43
  • or `rbind(NA, head(df,-1))` – moodymudskipper Mar 22 '18 at 20:46
  • @Maurits: what you are suggesting absolutely works for me, I just did not think of that simple solution. And how would you have described my problem other than calling it 'lagging' values? I had to describe my goal somehow, otherwise no one would have understood what I am trying to accomplish, right? – Niccola Tartaglia Mar 22 '18 at 21:13
  • No worries @NiccolaTartaglia & fair enough. Glad you got there:-) – Maurits Evers Mar 22 '18 at 21:15
  • Yep, took me long enough. Thanks for the help Maurits and everyone else! – Niccola Tartaglia Mar 22 '18 at 21:23
2

A couple more options

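# dplyr::lag is named explicitly; without dplyr attached, `lag` is stats::lag,
# which does not shift plain vectors.
data.frame(lapply(df, dplyr::lag))

require(purrr)
map_df(df, dplyr::lag)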

If your data is a data.table you can do

require(data.table)
as.data.table(shift(df))

Or, if you're overwriting df

df[] <- lapply(df, dplyr::lag) # Thanks Moody

# or, with magrittr's compound assignment pipe
require(magrittr)
df %<>% map_df(dplyr::lag)
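
If you would rather avoid extra packages, a base R helper is another possibility. The function name shift_rows and its sign convention are hypothetical, not from any package; this is only a sketch that shifts rows by index and fills the gaps with NA.

```r
# Hypothetical helper: shift every column of a data frame by n rows.
# n > 0 lags (values move down), n < 0 leads (values move up).
shift_rows <- function(df, n = 1L) {
  idx <- seq_len(nrow(df)) - n            # source row for each output row
  idx[idx < 1L | idx > nrow(df)] <- NA    # rows without a source become NA
  out <- df[idx, , drop = FALSE]          # NA indices yield all-NA rows
  rownames(out) <- NULL                   # renumber rows 1..nrow
  out
}

df <- data.frame(a = 1:10, b = 21:30)
df_lag  <- shift_rows(df, 1)   # same result as lagging every column
df_lead <- shift_rows(df, -1)  # same result as leading every column
```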
IceCreamToucan