1

I have a dataset for a survey that has several multiple choice questions (Check all, check 3, etc); Each option is coded as binary variables

 Location       Popn1 Popn2 Popn3 Popn4 .... Popn20
Location1        0     1    1     1
Location2        1     1    0     0 
Location3        0     0    0     0 

Here is my code:

 proc tabulate data=cath.binarydata;
 class location sectorcollapsed;
 var popn1-popn20;
 table (location='Location'), 
            (popn1-popn20)*(Sum='Count'*f=best5. mean='Percent'*f=percent8.1 N='Total responses received per question')
            /box="Populations Served by Location";

I'm using a proc tabulate to do a sum (count) and mean (percent) of each option in the multiple choice question by Location. However, I am finding that, when I do a check using my original dataset, the numbers don't make sense.

Here is a sample output: This is the kind of output I want and have right now

             Popn1                Popn2                    Popn3 ....... Popn20. 
             Count Freq   N       Count N    Freq

Location1     13   50%    26       11   42%    26
Location2

However, when I check back and manually calculate, what I think its doing doesn't make sense; for example, the N of 26 makes sense for location1, because there are 26 people in location1 and they all answered the question. So the sum being out of 26 makes sense.

However, for some of them, the N doesn't make sense - I thought the N would be all of the people who answered the question, but it doesn't quite add up like this. As an example, in one of the locations, there were 149 total people in that location, and 19 did not provide an answer at all - so the N here should be 130, but it is giving me a value of 134 in the output.

Does anyone have any thoughts or can help me understand how to use SAS to tabulate the multiple variables together in one column, while giving me the total answers for that option, and the percentage (out of the number of people who answered the question?) Any help is much appreciated,

Pre
  • 111
  • 7
  • 2
    Missing values. If any record is missing a value for any of the variables referred to anywhere in the PROC it is excluded from the full analysis. You need to explicitly add the MISSING option so that it is included in reports. – Reeza Nov 23 '20 at 21:57
  • 1
    Personally, I don't find that proc tabulate can handle conditional/flow questions well so I usually pre-summarize everything using PROC FREQs and then use PROC REPORT to display the results. It does require a lot of data wrangling to get the data in the right format for reporting though. – Reeza Nov 23 '20 at 21:59
  • 1
    yes you're right, it gave me the right numbers when i deleted all of the 0s from the cells where the rows added up to 0 - ie no answer for this question was received from this person – Pre Nov 23 '20 at 22:33

0 Answers0