Generate differences of unique observations across a date range

Question

Ciao guys,

I have the following data frame.

obj <- data.frame(occ = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
                  Date = c("1990-01", "1990-01", "1990-01", "1990-01", "1990-02", "1990-02", "1990-02", "1990-02", "1990-03", "1990-03", "1990-03", "1990-03", "1990-04", "1990-04", "1990-04", "1990-04"),
                  emp_value = c(33, 0, 55, 44, 0, 50, 70, 80, 91, 32, 32, 22, 11, 31, 42, 51)
)

I would like to generate a variable that holds, for every unique occupation (occ), the difference in emp_value from one date to the next.

My desired data frame would be:

obj <- data.frame(occ = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
                  Date = c("1990-01", "1990-01", "1990-01", "1990-01", "1990-02", "1990-02", "1990-02", "1990-02", "1990-03", "1990-03", "1990-03", "1990-03", "1990-04", "1990-04", "1990-04", "1990-04"),
                  emp_value = c(33, 0, 55, 44, 0, 50, 70, 80, 91, 32, 32, 22, 11, 31, 42, 51),
                  emp_diff = c(0, 0, 0, 0, -33, 50, 15, 36, 91, -18, -38, -48, -69, -70, -1, 10)
)

Note that my real data frame consists of thousands of values and hundreds of different occupations. In addition, not every occupation appears within each date.

Many thanks in advance!

What happens if one date is missing? Is your data.frame ordered by Date? – Martin Gal, Aug 07 '21 at 10:24
Yes it is ordered by date. Only occupations (variable occ) within a date can be missing. — freddywit, Aug 07 '21 at 10:27

Martin Gal · Accepted Answer · 2021-08-07T10:30:20.453


You could use dplyr:

library(dplyr)
obj %>%
  group_by(occ) %>%
  mutate(emp_diff = emp_value - lag(emp_value, default = 0))
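
One thing worth checking on the sample data: with `default = 0`, the first row of each occupation gets its own emp_value as the difference (33, 0, 55, 44 for 1990-01), not the 0 shown in the desired output. If 0 is wanted there, one option is to use the group's first value as the default. This is only a sketch, and it assumes the rows are already ordered by Date within each occ, as stated in the comments on the question.

```r
library(dplyr)

# Same sample data as in the question
obj <- data.frame(
  occ = rep(1:4, times = 4),
  Date = rep(c("1990-01", "1990-02", "1990-03", "1990-04"), each = 4),
  emp_value = c(33, 0, 55, 44, 0, 50, 70, 80, 91, 32, 32, 22, 11, 31, 42, 51)
)

res <- obj %>%
  group_by(occ) %>%
  # Using the group's first value as the default makes the first difference 0
  mutate(emp_diff = emp_value - lag(emp_value, default = first(emp_value))) %>%
  ungroup()

res$emp_diff
```

The trailing `ungroup()` is optional; it just avoids carrying the grouping into later steps.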

edited Aug 07 '21 at 10:30

answered Aug 07 '21 at 10:23

Martin Gal


Thanks for your message! That was almost what I needed, I just had to replace lag(emp_diff, default = 0) by lag(emp_value , default = 0). Thanks man! – freddywit Aug 07 '21 at 10:29
Ah... my mistake. Corrected it. – Martin Gal Aug 07 '21 at 10:30
If an occurrence is missing, this takes the difference between the two consecutive dates that are present for that occupation. For example, if one occurrence is missing, it subtracts the values for `1900-01-01` and `1900-01-03`. – Martin Gal Aug 07 '21 at 10:34
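
If the difference should instead be NA whenever the previous date is missing for an occupation, a possible sketch is to fill in the missing occ/Date combinations with `tidyr::complete()` before taking the lag. The assumptions here: every date appears for at least one occupation (as the asker says), the original emp_value contains no NA, and `obj_gap` is a hypothetical copy of the sample data with one row removed for illustration.

```r
library(dplyr)
library(tidyr)

obj <- data.frame(
  occ = rep(1:4, times = 4),
  Date = rep(c("1990-01", "1990-02", "1990-03", "1990-04"), each = 4),
  emp_value = c(33, 0, 55, 44, 0, 50, 70, 80, 91, 32, 32, 22, 11, 31, 42, 51)
)

# Hypothetical gap for illustration: occupation 1 is not observed in 1990-02
obj_gap <- obj %>% filter(!(occ == 1 & Date == "1990-02"))

res_gap <- obj_gap %>%
  complete(occ, Date) %>%      # add missing occ/Date rows with NA emp_value
  arrange(occ, Date) %>%
  group_by(occ) %>%
  # NA whenever the current or the previous date has no value
  mutate(emp_diff = emp_value - lag(emp_value)) %>%
  ungroup() %>%
  filter(!is.na(emp_value))    # drop the filler rows again

filter(res_gap, occ == 1)
```

With this variant the first date of each occupation also gets NA instead of 0, since `lag()` is called without a default.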
