0

I have a dataset like this:

data have;
input date :date9. index;
format date date9.;
datalines;
31MAR2019 10
30APR2019 12
31MAY2019 15
30JUN2019 14
;
run;

I would like to add observations with dates from the maximum date (hence from 30JUN2019) until 31DEC2019 (by months) with the value of index being the last available value: 14. How can I achieve this in SAS? I want the code to be flexible, thus for every such dataset, take the maximum of date and add monthly observations from that maximum until DEC2019 with the value of index being equal to the last available value (here in the example the value in JUN2019).

doremi
  • 141
  • 3
  • 15
  • Will you ever want to do this with `BY` groups ? Is the data always presorted ? – Richard Feb 07 '20 at 10:42
  • Will you ever have gaps in the middle you need to fill in? For example what if the april observation was not there? Would you want to create one between the march and may observations? – Tom Feb 07 '20 at 14:06

2 Answers2

1

An explicit DO loop over the SET provides the foundation for a concise solution with no extraneous worker variables. Automatic variable last is automatically dropped.

data have;
input date :date9. index;
format date date9.;
datalines;
31MAR2019 10
30APR2019 12
31MAY2019 15
30JUN2019 14
;

data want;
  do until (last);
    set have end=last;
    output;
  end;
  do last = month(date) to 11;  %* repurpose automatic variable last as a loop index;
    date = intnx ('month',date,1,'e');
    output;
  end;
run;

Always helpful to refresh understanding. From SET Options documentation

END=variable
creates and names a temporary variable that contains an end-of-file indicator. The variable, which is initialized to zero, is set to 1 when SET reads the last observation of the last data set listed. This variable is not added to any new data set.

Richard
  • 25,390
  • 3
  • 25
  • 38
0

You can do it using end in set statement and retain statement.

data want(drop=i tIndex tDate);
   set have end=eof;
   retain tIndex tDate;
   if eof then do;
      tIndex=Index;
      tDate=Date;
   end;
   output;
   if eof then do;
      do i=1 to 12-month(tDate);
         index=tIndex;
         date = intnx('month',tDate,i,'e');
         output;
      end;
   end;
run;

INPUT:

+-----------+-------+
|   date    | index |
+-----------+-------+
| 31MAR2019 |    10 |
| 30APR2019 |    12 |
| 31MAY2019 |    15 |
| 30JUN2019 |    14 |
+-----------+-------+

OUTPUT:

+-----------+-------+
|   date    | index |
+-----------+-------+
| 31MAR2019 |    10 |
| 30APR2019 |    12 |
| 31MAY2019 |    15 |
| 30JUN2019 |    14 |
| 31JUL2019 |    14 |
| 31AUG2019 |    14 |
| 30SEP2019 |    14 |
| 31OCT2019 |    14 |
| 30NOV2019 |    14 |
| 31DEC2019 |    14 |
+-----------+-------+
Llex
  • 1,770
  • 1
  • 12
  • 27