-1

I need to calculate so called MAT (Movie Anual Total), means the % change in sales value between same day in two different year:

    ID Sales Day Month Year
    A  500   31  12   2015
    A  100   1    1   2016
    A  200   2    1   2016

    ...      
    A  200   1    1   2017

Does anybody have an idea about how to deal with it?

I want to get this:

ID Sales Day Month Yeas **MAT**
lmo
  • 37,904
  • 9
  • 56
  • 69
aannie
  • 7
  • 2
  • 8
  • 2
    I don't think that the mat tag refers to what you think it does. – lmo Apr 14 '17 at 11:41
  • 2
    you should start by giving us some work you have done which needs help. Try writing code to select the appropriate 2 days worth of data and how you think you should calculate it, then put the code here, someone will definitely be able to help you refine it. – sconfluentus Apr 14 '17 at 11:44

2 Answers2

0

With the way your data is set up, you're actually quite close. What you want to do now is group your data by month and day, order each group by year, and then take the successive differences (assuming you want the MAT for sequential years)

library(lubridate)
library(dplyr)
X <- 
  data.frame(date = seq(as.Date("2014-01-01"), 
                        as.Date("2017-12-31"), 
                        by = 1)) %>%
  mutate(day = day(date),
         month = month(date),
         year = year(date),
         sales = rnorm(nrow(.), mean = 100, sd = 5))

X %>%
  group_by(month, day) %>%
  arrange(month, day, year) %>%
  mutate(mat = c(NA, diff(sales))) %>%
  ungroup()

If you are wanting to be able to generically take a difference between any two years, this will need some refinements.

Benjamin
  • 16,897
  • 6
  • 45
  • 65
0

Here is a solution with base R. Mainly it is a self-join:

d$prev.Year <- d$Year-1
dd <- merge(d,d, by.x=c("prev.Year", "Month", "Day"), by.y=c("Year", "Month", "Day"))
dd$MAT <- with(dd, (Sales.x-Sales.y)/Sales.y)

If you have different values in ID you eventually want:

dd <- merge(d,d, by.x=c("ID", "prev.Year", "Month", "Day"), by.y=c("ID", "Year", "Month", "Day"))

data:

d <- read.table(header=TRUE, text=
"ID Sales Day Month Year
A  500   31  12   2015
A  100   1    1   2016
A  200   2    1   2016
A  200   1    1   2017")
jogo
  • 12,469
  • 11
  • 37
  • 42