
I know this is a very basic question but my code keeps failing when trying to run what I found through the help documentation.

Up to now I have been running an analysis project out of the WORK library, which I understand gets wiped every time a session ends. I have done a lot of data cleaning and preparation and do not want to repeat it every time before I start my analysis.

From reading https://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001310720.htm I understand that I have to output the cleaned dataset to a non-temporary library.

Steps I have taken so far: 1) created a new Library called "Project" 2) Saved it in a folder that I have under "my folders" in SAS 3) My code for saving the cleaned dataset to the "Project" library is as follows:

PROC SORT DATA=FAA_ALL NODUPKEY;
BY GROUND_SPEED;
DATA PROJECT.FAA_ALL;
RUN;

Then I run this code in a new program:

PROC PRINT DATA=PROJECT.FAA_ALL;
RUN;

It says there are no observations and that the dataset is essentially empty.

Can someone tell me where I'm going wrong?

nojohnny101

2 Answers


Your problem is in the PROC SORT step:

PROC SORT DATA=FAA_ALL NODUPKEY;
BY GROUND_SPEED;
DATA PROJECT.FAA_ALL;
RUN;

It should be:

PROC SORT DATA=FAA_ALL OUT= PROJECT.FAA_ALL NODUPKEY;
BY GROUND_SPEED;
RUN;

The DATA PROJECT.FAA_ALL statement ended the PROC SORT step and started a new DATA step. Because that DATA step reads no input, it created PROJECT.FAA_ALL as a blank dataset.

DomPazz

Something else worth mentioning: your DATA step didn't do what you might have expected because it had no SET statement. Your code was equivalent to:

PROC SORT DATA=WORK.FAA_ALL NODUPKEY;
BY GROUND_SPEED;
RUN;

DATA PROJECT.FAA_ALL;
 SET _NULL_;
RUN;

PROJECT.FAA_ALL is empty because nothing was read in.

When no OUT= option is given, PROC SORT sorts the dataset in place, replacing the input dataset with the sorted result. You could have SAS copy the sorted data to the permanent library by adding a SET statement to your DATA step:

PROC SORT DATA=WORK.FAA_ALL NODUPKEY;
BY GROUND_SPEED;
RUN;

DATA PROJECT.FAA_ALL;
 SET WORK.FAA_ALL;
RUN;

However, this still takes two steps and requires extra disk I/O. Using the OUT= option of a SAS procedure (as in DomPazz's answer) is almost always faster and more efficient than using a DATA step just to move data.
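Putting the pieces together, here is a minimal sketch of the one-step approach, followed by a check that the saved dataset actually contains data. The LIBNAME path is a hypothetical placeholder, not something taken from the question: point it at whatever folder backs your PROJECT library. A libref assigned with a LIBNAME statement lasts only for the current session, so a new session may need the LIBNAME statement again unless the library is set up to be re-created at startup.

```sas
/* Hypothetical path (an assumption): replace with the folder behind your library. */
libname project '/folders/myfolders/project';

/* Sort WORK.FAA_ALL, keep one observation per GROUND_SPEED value,
   and write the result directly to the permanent library.
   With OUT=, WORK.FAA_ALL itself is left unchanged. */
proc sort data=work.faa_all out=project.faa_all nodupkey;
    by ground_speed;
run;

/* Report the observation and variable counts of the saved dataset. */
proc contents data=project.faa_all;
run;
```

If PROC CONTENTS reports the observation and variable counts you expect, PROC PRINT DATA=PROJECT.FAA_ALL should list the data in later sessions once the libref is assigned.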

david25272