0

I need to create a variable which combined three variables age , sexe and agios. All of them are located in a sas table and all of them are numeric.

In first step, I transform them into categorical variables then I suggest to use the compress function to create this new variable:

data new;
set new;

attrib sexe format=$15.;
if sexe=1 then sexe="HOMME";
else if sexe=0 then sexe="FEMME";

attrib agios format=$15.;
if agios=0 then agios="NON_AGIOS";
else AGIOS="AGIOS";

attrib age format=$15.;
if (age<=0) and (age=>25) then age="a25";
if (age<=26) and (age=>40) then age="a40";
if (age<=41) and (age=>65) then age="a60";
if (age=>65) then age="a65";
new_variable=compress(agios||sexe||age);
run; 

  But I have an warnings repeted to all the concerned variables:

WARNING: Variable agios has already been defined as numeric.

And there wasn't any transformation for the variables in the table. Also, values of the concerned variables age , sexe and agios disappear from the table. It begins empty.

How do I correct this please? Or is there any other suggestion to create the new variable without using ifelse?

Thank you

A2018
  • 77
  • 1
  • 1
  • 4
  • Possible duplicate of [SAS : Convert character to numeric without creating another variable](https://stackoverflow.com/questions/27689318/sas-convert-character-to-numeric-without-creating-another-variable) – user667489 Jan 23 '18 at 14:32

2 Answers2

2

This issue is in this line:

if agios=0 then agios="NON_AGIOS";

The compiler initialises agios to be a numeric variable. You cannot then assign it a character value. One way around this is to create a new variable (agios2) then rename it, eg:

data new;
set new(rename=(agios=agios2 sexe=sexe2 age=age2)); /* rename vars on input */

attrib sexe format=$15.;
if sexe2=1 then sexe="HOMME";
else if sexe2=0 then sexe="FEMME";

attrib agios format=$15.;
if agios2=0 then agios="NON_AGIOS"; /* reference the renamed variable */
else AGIOS="AGIOS";
drop agios2 age2 sexe2;             /* drop the renamed variables on output */

attrib age format=$15.;
if (age2<=0) and (age2=>25) then age="a25";
if (age2<=26) and (age2=>40) then age="a40";
if (age2<=41) and (age2=>65) then age="a60";
if (age2=>65) then age="a65";
new_variable=compress(agios||sexe||age);
run; 
Allan Bowe
  • 12,306
  • 19
  • 75
  • 124
  • thank you I appreciate your answer, but it works only for the `agios` , how does I add the `rename` function for `age` and `sexe`?? – A2018 Jan 23 '18 at 14:57
  • It just doesn't work for the `age` variable, the column was empty and there is no transformation? I decide to place `drop agios2 age2 sexe2;` after and before `new_variable=compress(agios||sexe||age);` and it doesn't work too! – A2018 Jan 23 '18 at 15:24
  • I noticed a typo on age, which I updated, is it working now? – Allan Bowe Jan 23 '18 at 19:06
  • the placement of the drop statement is not relevant (it's only considered at compile time) – Allan Bowe Jan 23 '18 at 19:07
1

Another approach would be to create, for each variable, a sequenced format with the (NOTSORTED) option. The formats are used when reporting. For example in Proc TABULATE the FORMAT statement and CLASS statement with option order=formatted preloadfmt.

The age categorizations in your posted code appears incorrect

if (age<=0) and (age=>25) then age="a25";

The age value can not be simultaneously <= 0 and >= 25.

When labeling ranges I would recommend using values that are consistent with regards to the range they represent. For example sometimes you use the endpoint value in the label (a25, a40) and other times a middling point (a60). The scheme is inconsistent.

Here is sample code that applies formats to original data when being output in Proc TABULATE.

proc format;
  value sexe (notsorted)
    0 = 'HOMME'
    1 = 'FEMME'
  ;

  value agios (notsorted)
    0 = 'NON_AGIOS'
    1 = 'AGIOS'
  ;

  value age (notsorted)
     0< - 25 = 'a25'
    25< - 40 = 'a40'
    40< - 65 = 'a60' /* inconsistent labeling of range */
    65< - high = 'a65'
  ;

  * mock data;
  data have;
    do personid = 1 to 100;
      sexe = ranuni(123) < 0.40;
      agios = ranuni(123) > 0.65;
      age = floor(12 + 58 * ranuni(123));
      output;
    end;
  run;

  proc tabulate data=have;
    class sexe agios age / order=formatted preloadfmt;
    classlevel sexe agios age / style=[pretext='A0A0A0A0'x];
    format sexe sexe. agios agios. age age.;
    table
      ( sexe agios age ) 
    , N
    / nocellmerge
    ;
  run;

enter image description here

Sometimes creating new data with the formatted values becoming the actual data is appropriate, especially if handing off data to other users that may not have SAS and the formatting codes.

data ...;
  set ...;
  ...
  sexe_fv = put (sexe,sexe.);  * fv stands for formatted value;
  agios_fv = put (agios, agios.);
  age_fv = put (age, age.);
  ...
run;
Richard
  • 25,390
  • 3
  • 25
  • 38