I have some panel data that looks like this (code to enter my dataset is at the end):
  countrycode year X
1         ARG 2015 2
2         ARG 2016 2
3         ARG 2017 1
4         AUS 2015 1
5         AUS 2016 3
6         AUS 2017 2
7         USA 2015 6
8         USA 2016 5
9         USA 2017 8
I'd like to first-difference the X variable within each country (i.e. subtract last year's X from this year's X). It works perfectly when I don't use pipes:
library(tidyverse)
library(plm)
pdf <- pdata.frame(df, index = c("countrycode", "year"))
# This works perfectly
pdf <- mutate(pdf, dX = pdf$X - lag(pdf$X))
The results are exactly what I'd want: every 2015 value of dX is NA, because there is no 2014 value of X to compare with.
  countrycode year X dX
1         ARG 2015 2 NA
2         ARG 2016 2  0
3         ARG 2017 1 -1
4         AUS 2015 1 NA
5         AUS 2016 3  2
6         AUS 2017 2 -1
7         USA 2015 6 NA
8         USA 2016 5 -1
9         USA 2017 8  3
But when I try to use %>% :
pdf <- pdf %>% mutate(dX2 = X - lag(X))
the results no longer take the panel structure into account: dX2 is differenced straight across country boundaries. For example, dX2 for USA in 2015 should be NA, but instead it's 4 (USA's 2015 value of 6 minus AUS's 2017 value of 2).
  countrycode year X dX dX2
1         ARG 2015 2 NA  NA
2         ARG 2016 2  0   0
3         ARG 2017 1 -1  -1
4         AUS 2015 1 NA   0
5         AUS 2016 3  2   2
6         AUS 2017 2 -1  -1
7         USA 2015 6 NA   4
8         USA 2016 5 -1  -1
9         USA 2017 8  3   3
Is there some way to use pipes in plm or with panel data?
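For comparison, here is a minimal sketch of the result I'm after, written with plain dplyr grouping instead of plm. It assumes an ordinary data frame (not a pdata.frame) and no gaps in the years within each country. The column name dX3 is just an illustrative name, and this bypasses plm's panel-aware lag entirely, so it is a workaround rather than an answer to the question above.

```r
library(dplyr)

# Same data as in the question, built as an ordinary data frame
df <- data.frame(
  countrycode = rep(c("ARG", "AUS", "USA"), each = 3),
  year = rep(2015:2017, times = 3),
  X = c(2L, 2L, 1L, 1L, 3L, 2L, 6L, 5L, 8L),
  stringsAsFactors = FALSE
)

# Group by country so the lag restarts for each country;
# dplyr::lag is named explicitly because loading plm can mask lag()
df_diff <- df %>%
  group_by(countrycode) %>%
  mutate(dX3 = X - dplyr::lag(X, order_by = year)) %>%  # dX3: hypothetical column name
  ungroup()

df_diff
```

Here the first year of every country comes out as NA, which matches the non-piped dX column.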
Full code here:
library(tidyverse)
library(plm)
df <- data.frame(stringsAsFactors=FALSE,
  countrycode = c("ARG", "ARG", "ARG", "AUS", "AUS", "AUS", "USA", "USA",
                  "USA"),
  year = c(2015L, 2016L, 2017L, 2015L, 2016L, 2017L, 2015L, 2016L,
           2017L),
  X = c(2L, 2L, 1L, 1L, 3L, 2L, 6L, 5L, 8L)
)
df
# using panel
pdf <- pdata.frame(df, index = c("countrycode", "year"))
# This works perfectly
pdf <- mutate(pdf, dX = pdf$X - lag(pdf$X))
pdf
# Piped version: the lag ignores the panel structure
pdf <- pdf %>% mutate(dX2 = X - lag(X))
pdf