
I am getting a strange frequency table that does not match my data. The variable PPETHMREC has been recoded (as shown in the code below), but although the recoded values are present, the formatted output shows only a single outcome instead of both (i.e., the table shows 100% Non-Hispanic when the data do not reflect that). Any suggestions? I feel like I am missing something incredibly obvious here. The other frequency table, for PPRACEM, is coming out as expected.

/* Read in the data and create a new dataset */
data work.ITDemogRecodes;
   set work.ITDemogRecodes;
   
/* Recode variables */
    if PPETHM = 1 then PPETHMREC = 1;
    else if PPETHM = 2 then PPETHMREC = 1;
    else if PPETHM = 3 then PPETHMREC = 1;
    else if PPETHM = 4 then PPETHMREC = 2;
    else if PPETHM = 5 then PPETHMREC = 1;
    
/* add labels to variables */
label PPETHM = "What is your Ethnicity?"
      PPRACEM = "What is your Race?"
      PPETHMREC = "What is your Ethnicity?";
      
/* create formats*/
proc format;
   value race
   1 = 'White'
   2 = 'Black or African American'
   3 = 'American Indian or Alaska Native'
   4 = 'Asian'
   5 = 'Native Hawaiian/Pacific Islander'
   6 = '2+ Races'
   7 = 'Other';
   
   value ethm
   1 = 'White, Non-Hispanic'
   2 = 'Black, Non-Hispanic'
   3 = 'Other, Non-Hispanic'
   4 = 'Hispanic'
   5 = '2+ Races, Non-Hispanic';
   
   value ethmrec
   1 = 'Non-Hispanic'
   2 = 'Hispanic';
   
/* format variables */
data work.ITDemogRecodes;
   set work.ITDemogRecodes;
   FORMAT PPETHM ethm.
          PPETHMREC ethmrec.
          PPRACEM race.;
   run;
   
/* create frequency chart */
proc freq data=work.ITDemogRecodes;
  tables PPETHMREC PPRACEM / out=freq_out;
  run;

I've tried reassigning different values and rewriting the format and recoding steps, with no luck. I can't find the bug.

  • It is some issue with the "format variables" step, or with the creation of the formats to begin with – Noah Chirico Mar 27 '23 at 19:21
  • You keep overwriting the input data with your data steps. Perhaps some earlier attempt has changed the data. Can you get the original dataset back and start over? – Tom Mar 27 '23 at 19:26
  • Why do a recode? You should be able to have an additional format do the 'recode' at FREQ time and just table PPETHM and PPRACEM. – Richard Mar 28 '23 at 16:24

1 Answer


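Note that using the same name for the input and output data sets makes your code hard to debug, because each run overwrites the data that the next run reads.

Test your recoding with the following code, which writes to a new data set and cross-tabulates the original variable against the recoded one: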


/* Read in the data and create a new dataset */
data work.ITDemogRecodes2;
   set work.ITDemogRecodes (drop=PPETHMREC);
   
/* Recode variables */
    if PPETHM = 1 then PPETHMREC = 1;
    else if PPETHM = 2 then PPETHMREC = 1;
    else if PPETHM = 3 then PPETHMREC = 1;
    else if PPETHM = 4 then PPETHMREC = 2;
    else if PPETHM = 5 then PPETHMREC = 1;
run;

proc freq data=ITDemogRecodes2;
table PPETHM * PPETHMREC /list ;
run;
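
As suggested in the comments, the recode step can also be skipped entirely by letting a format do the grouping when PROC FREQ runs, since PROC FREQ tabulates by formatted value. The following is a minimal sketch, not a tested fix: the format name ethgrp is hypothetical, the code-to-label mapping is taken from the ethm format in the question, and it assumes PPETHM is numeric and the race format has already been created.

```sas
/* Hypothetical format: collapses the five PPETHM codes into two groups. */
/* Assumes PPETHM is numeric with codes 1-5 as in the ethm format.       */
proc format;
   value ethgrp
      1, 2, 3, 5 = 'Non-Hispanic'
      4          = 'Hispanic';
run;

/* Apply the formats only for this procedure; the data set is not changed. */
proc freq data=work.ITDemogRecodes;
   tables PPETHM PPRACEM;
   format PPETHM ethgrp. PPRACEM race.;
run;
```

Because the FORMAT statement sits inside PROC FREQ, nothing is written back to work.ITDemogRecodes, which avoids the overwriting problem raised in the comments.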
Reeza