8

In SAS 9, how can I in a simple data step, rearrange the order the field.

Data set2;
  /*Something probably goes here*/
  set set1;
run;

So if set1 has the following fields:

Name   Title   Salary
A      Chief   40000
B      Chief   45000

Then I can change the field order of set2 to:

Title  Salary  Name
Chief  40000   A
Chief  45000   B

Thanks,

Dan

Joe
  • 62,789
  • 6
  • 49
  • 67
Dan
  • 9,935
  • 15
  • 56
  • 66
  • 1
    Could you add a little more information as to why you want to do that? If its for formatting / printing i think you can handle that with proc print... – mcpeterson Apr 19 '10 at 20:39

4 Answers4

10

Some quick googling gave me this method:

data set2;
  retain title salary name;
  set set1;
  run;

from here: http://analytics.ncsu.edu/sesug/2002/PS12.pdf

mcpeterson
  • 4,894
  • 4
  • 24
  • 24
5

If you have a very large number of variables in your dataset sometimes it is easier to use an sql statement instead of a datastep. This allows you to list just the variables whose order you care about and use a wildcard to retain everything else.

proc sql noprint;
  create table set2 as
  select title, salary, *
  from set1;
quit;

If you are doing this with a large table you can save yourself the IO overhead by creating a view instead. This can be applied to both the data set approach or the proc sql approach.

proc sql noprint;
  create view set2 as
  select title, *
  from set1;
quit;

** OR;

data set2 / view=set2;
  retain title salary name;
  set set1;
run;

Cheers Rob

Robert Penridge
  • 8,424
  • 2
  • 34
  • 55
  • 1
    Just to be clear, all the methods listed in the link provided by @mcpeterson also have the behaviour you attribute to PROC SQL- one does not need to specify an ordering of all the variables in the dataset, only those of interest. The rest are still kept (unless otherwise specified elsewhere). – user667489 Mar 03 '12 at 20:14
3

You can also use an informat statement to do this - there is no need to specify any informats. I suspect this is slightly more efficient than an equivalent retain statement, as it allows SAS to initialise values to missing rather than retrieving them from the previous row. In practice the difference is minimal, and you also have the option of using a view.

data set2;
  informat title salary name;
  set set1;
run;

The variables specified in the informat statement are moved to the left of the dataset and into that order, and the rest are left as they were in the input dataset.

user667489
  • 9,501
  • 2
  • 24
  • 35
  • SAS doesn't work the way you think it does, as far as retain vs not retain. RETAIN doesn't retrieve values from the previous row; it actually causes SAS to not set a variable to missing between iterations, so if there were a difference in how this and the informat solution worked, it would be faster, not slower. I suspect there's zero difference between the two, as SAS would do exactly the same thing in both cases (define the variable in the PDV and stop.) – Joe Aug 05 '14 at 21:32
  • Yes, I got that bit wrong. However, I think there's still a case for informat vs. retain, as the former is less likely to cause any unintended behaviour, e.g. if the source dataset contains missing values. – user667489 Aug 05 '14 at 22:19
  • I'll tentatively agree with you, pending me remembering why it is that almost nobody uses `informat` (or `format`) this way, and almost everybody uses `retain`, despite the potential for unintended consequences... – Joe Aug 05 '14 at 22:21
1

You can use anything that initializes the PDV with variables in the order you want (ATTRIB, ARRAY, FORMAT, INFORMAT, LENGTH, RETAIN).

Source: This SAS note: http://support.sas.com/kb/8/395.html