For an if-query I would like to create a macro varibale giving the respective frequency of the underlying time series. I tried to get some descriptive statistics from proc time series. However, they unfortunately do not include the figure for the frequency. The underlying times series does not necessarily conclude all periods of the frequency. That excludes a selected count by proc sql from my point of view. Does anyone know an efficient procedure to determine the frequency without computing the frequency on my own (in a data step or a proc sql code)?
2 Answers
You can use the outspectra
statement to help learn what kind of seasonality it has. Based on the data, give PROC TIMESERIES your best guess of day, month, etc. In the example below, we know we want to forecast by month but we do not know what seasonality it has.
proc timeseries data=sashelp.air outspectra=spectra;
id date interval=month;
var air;
run;
Plot this spectra dataset in proc sgplot
and you'll see something that looks like this:
proc sgplot data=spectra;
where NOT missing(period);
series x=period y=p;
run;
This line will naturally increase over time, but we're looking for a bumps in the line. Notice the large bump somewhere between 0 and 24 months and the several smaller bumps before it. Let's zoom in on that by filtering out the longer periods.
proc sgplot data=spectra;
where period < 24 and NOT missing(period);
series x=period y=p;
run;
It's pretty clear that there is a strong seasonality of 12, with potentially smaller cycles at 3 and 6 months. From this plot, we can conclude that our seasonality should be 12 based on our spectra plot.
You can turn this into a macro to help identify the season if you'd like. Simply search for the largest bump within a reasonable timeframe. In our case we'll choose 36 because we do not suspect that we have any seasonality > 36 months.
proc sort data=spectra;
by period;
run;
data identify_period;
set spectra;
by period;
where NOT missing(period) AND period LE 36;
delta = abs(p - lag(p) );
run;
proc sql;
select period, max(delta) as max_delta
from identify_period
having delta = max(delta)
;
quit;
Output:
PERIOD max_delta
12 163712

- 10,597
- 1
- 12
- 21
-
Thanks for your quick response. I think this is a quite complicated code for what I actually would like to get. Therefore I would like to specify my query more clearly: I have a time series thatmay look like below and I would like to determine the frequency of it(either 4 or 12) in order to put it into a macro variable. Is there another less complicated approach? `data work.timeseries; input date date. value; format date date9. ; datalines; 01Oct18 3000 01Nov18 4000 01Dec18 6500 01Jan19 7000 01Feb19 4000 01Mar19 5000 01Apr19 7500 01May19 4800 01Jun19 4500 run;` – Juergen Feb 18 '22 at 16:23
-
Thank you for the clarification. Unfortunately I'm unsure of an easier way beyond outputting the spectral density plot and viewing it with sgplot, or automating it with the above logic. There are procs that can do this automatically but you'd need Forecast Server. – Stu Sztukowski Feb 18 '22 at 17:45
I don't know how to do this without data step logic, but you could wrap the data step in a macro as follows:
%macro get_frequency(data,date_variable,output_variable);
proc sort data=&data (keep=&date_variable) out=__tempsorted;
by &date_variable;
run;
data _null_;
set __tempsorted end=lastobs;
prevdate=lag(&date_variable);
if _n_ > 1 then do;
interval_number+1;
interval_total + (&date_variable - prevdate);
end;
if lastobs then do;
average_interval = interval_total/interval_number;
frequency = round(365.25/average_interval);
call symput ("&output_variable",left(put(frequency,best32.)));
end;
run;
proc datasets nolist;
delete __tempsorted;
run;
quit;
%mend get_frequency;
Then you can call the macro on your original data set timeseries to examine the variable date and create a new macro variable frequency1 with the required frequency.
data work.timeseries;
input date date. value;
format date date9.;
datalines;
01Oct18 3000
01Nov18 4000
01Dec18 6500
01Jan19 7000
01Feb19 4000
01Mar19 5000
01Apr19 7500
01May19 4800
01Jun19 4500
;
run;
%get_frequency(timeseries,date,freqency1)
%put &=frequency1;
This seems to work on your sample data where each date is the first of the month. If your dates are evenly distributed (e.g. always near month start/end, or always near mid-month etc.) then this macro should work ok. Obviously if you have multiple observations per date then it will give the completely incorrect frequency.

- 1,213
- 3
- 13