-1

I've been working with SAS to get some rolling averages, and I can get it to give me those outputs but I want it done by date buckets if possible, which I don't quite know how to do.

Here's some sample data:

Date        Amount   User   Moving Avg
2019-01-01   100      XX1        100
2019-01-02   150      XX1        125
2019-01-03    50      XX1        100
2019-01-20    10      XX1        77.5

2019-01-15   125      XX2        125
2019-01-16    75      XX2        100
2019-01-18    50      XX2        83.3

What I would like is for the averages to be bucketed into date ranges

For example how can I make it show me the average amount for the following week buckets BY the user:

For user XX1: Dec 30th - Jan 5th (would be $100) Jan 20th - Jan 26th (would be $10)

For user XX2: Jan 13th to 19th (would be $83.3)

blargh
  • 29
  • 1
  • 8
  • How are you determining those time periods? How can we tell SAS those are the time periods of interest? Are they weeks? – Reeza Aug 01 '19 at 21:18
  • They're literally dates mate. 2019-01-01 = Jan 1st 2019. Not sure what more input I can provide on the inputted dates. The buckets I would like on a weekly basis. – blargh Aug 01 '19 at 21:20
  • Why does January 1 to 5 have to be one group and Jan 20 to 26? They're different amounts of days? How do you know to group those together? – Reeza Aug 01 '19 at 21:23
  • 1
    Ah that was just random, in reality it should be Dec 30th to Jan 5th since that is technically the full week. I will update the description accordingly thanks. – blargh Aug 01 '19 at 21:24
  • If you just want weeks, run PROC MEANS and apply a WEEK format to your date variable. There are several week formats available that you can use out of the box. Also, this is not considered a rolling date, usually rolling implies overlaps. There's an example here you can see, which you can run: https://gist.github.com/statgeek/0cae5568752959b035516d6ac07a20fb – Reeza Aug 01 '19 at 21:24
  • If you really do need a moving average, you can use PROC EXPAND https://gist.github.com/statgeek/07a3708dee1225ceb9d4aa75daab2c52 or an array https://gist.github.com/statgeek/27e23c015eae7953eff2 – Reeza Aug 01 '19 at 21:27

1 Answers1

1

You could use PROC FORMAT to define custom date ranges and then compute the averages you want.

proc format;
  value datebin
    '01Jan2019'd - '19Jan2019'd  = bin1
    '20Jan2019'd - '26Jan2019'd =  bin2
  ;
run;

proc means data = have ;
  var amt;
  class user d;
  output out = avgs mean = running_avg;
  format d datebin.;
run;

data want;
  set avgs;
  where _type_ = 3;
  keep user d running_avg;
run;

You could write a simple macro to generate the formats that systematically points to a weekly date range starting from any given date.

SAS2Python
  • 1,277
  • 5
  • 16
  • This seems perfect! I will look into the macro although I am not very good at them. – blargh Aug 01 '19 at 21:30
  • You don't need a macro, the WEEK function and/or format usually have a few ways to control where the week starts, it's rare to need a custom format for this unless you have non-standard 7 day weeks - ie business holidays shift your 'week'. But that often makes the data unusable since it's not comparable over time. – Reeza Aug 01 '19 at 22:03