Quick way to count missing values in many variables

Question

I've written a small program that counts the number of 'None' values, i.e. missing values, in each variable. The problem is that it is very slow (around 6-7 minutes for 1000 variables), which is far too long because I am working with much bigger datasets.

So I am looking for an alternative, and maybe someone can help here. Here is my program:

BEGIN PROGRAM.
import spss
vars=spss.GetVariableCount()

for i in range(vars):
    dataCursor=spss.Cursor([i])
    oneVar=dataCursor.fetchall()
    dataCursor.close()
    miss=str(oneVar)
    counter=miss.count('None')
    #print counter
print "done"
END PROGRAM.

I've also tried to replace:

counter=miss.count('None')

with

counter=miss.find('None')

but this does not change anything. Is anyone able to help me here? I also found this program via Google:

begin program.
import spssdata
majors = []
for case in spssdata.Spssdata('mq1'):
    major = case[0]
    if major not in majors:
        majors.append(major)
print majors
end program.

but I cannot get it to run for all variables. Because a 'None', when present, is always listed in the first position of that list, I thought it might help to find a solution.

If anyone has any idea, I would be very grateful!

score 0 · Accepted Answer · answered Aug 24 '18 at 11:29

The AGGREGATE command in SPSS syntax can do this easily, using the NMISS function to count the missing values in each variable:

aggregate ...... /nms1 to nms1000=nmiss(var1 to var1000).
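If you would rather stay in Python, the main cost in the question's program is probably that it opens a new cursor and rereads the data once per variable. A minimal sketch of a single-pass count follows. The helper name `count_missing` and the sample rows are made up for illustration; inside SPSS, the rows would presumably come from one `spss.Cursor()` opened over all variables and read case by case, which I have not timed on a dataset of your size.

```python
def count_missing(rows, n_vars):
    """Count None entries per column in a single pass over the rows."""
    counts = [0] * n_vars
    for row in rows:
        for i, value in enumerate(row):
            # Missing values are assumed to arrive as None, as in the question.
            if value is None:
                counts[i] += 1
    return counts


# Hypothetical stand-in for the cases (one tuple per case, one item per variable).
sample_rows = [
    (1.0, None, "a"),
    (None, None, "b"),
    (3.0, 2.0, None),
]
missing_counts = count_missing(sample_rows, 3)
print(missing_counts)
```

Testing `value is None` also avoids a pitfall of `str(oneVar).count('None')`, which would count any string value that happens to contain the text "None".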

eli-k
