Is it possible to create the data object in Python for SPSS

Question

I have a Python script that reads an XML file into an array (in a CSV format I created). I'd like to be able to use that data directly instead of saving it to a file.

Is this possible? It would be like creating a Var.File node, but instead of pointing to a file it would take the data I have already pulled in.

E.g. data[0] = "1,A,B,C" # single line of all documents.

This is off-topic here, but would IMO be on topic at SO. I wrote a pertinent blog post, http://andrewpwheeler.wordpress.com/2014/09/19/turning-data-from-python-into-spss-data/ — Andy W, Sep 30 '14 at 13:14
Oh ok. Thanks. Feel free to close. Your blog looks exactly like what I was looking for. If you want to post it as an answer for a point let me know. — Simon O'Doherty, Sep 30 '14 at 13:23

score 3 · Accepted Answer · answered Sep 30 '14 at 14:01

In a nutshell, you can paste your Python program between BEGIN PROGRAM and END PROGRAM statements directly within an SPSS syntax file. Inside that Python block you can then define an SPSS dataset and append cases to it.

What is potentially nice about this is that it can be done line by line, so in theory it can process quite large files. Even with tiny files it should be faster than writing and then reading CSV files. The example below is taken from a blog post I wrote on the subject:

BEGIN PROGRAM Python.
import spss

MyData = [(1,2,'A'),(4,5,'B'),(7,8,'C')] #make a list of tuples for the data

spss.StartDataStep()                   #start the data step
MyDatasetObj = spss.Dataset(name=None) #define the data object
MyDatasetObj.varlist.append('X1',0)    #add in 3 variables
MyDatasetObj.varlist.append('X2',0)
MyDatasetObj.varlist.append('X3',1)
for i in MyData:                       #add cases in a loop
  MyDatasetObj.cases.append(i)
spss.EndDataStep()
END PROGRAM.
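
For data that is already held as CSV-style strings, as in the question, the rows first need to be split and typed before they can be appended as cases. Below is a minimal sketch of that preparation step in plain Python. The helper names (rows_from_csv_lines, infer_var_types, convert_rows) are made up for illustration and are not part of the spss module. It assumes every line has the same number of fields, and it treats a column as numeric only if every value parses as a float.

```python
import csv

def rows_from_csv_lines(lines):
    """Split CSV-formatted strings into lists of string fields."""
    return [row for row in csv.reader(lines)]

def is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def infer_var_types(rows):
    """One type code per column: 0 for numeric, otherwise the
    maximum string width (the convention used by varlist.append
    in the example above)."""
    types = []
    for column in zip(*rows):
        if all(is_number(v) for v in column):
            types.append(0)
        else:
            # Width of at least 1 so an all-empty column is still a string.
            types.append(max(len(v) for v in column) or 1)
    return types

def convert_rows(rows, types):
    """Convert numeric columns to float, leave string columns as-is."""
    return [
        [float(v) if t == 0 else v for v, t in zip(row, types)]
        for row in rows
    ]

# Hypothetical data in the format described in the question.
data = ["1,A,B,C", "2,D,E,F"]
rows = rows_from_csv_lines(data)
types = infer_var_types(rows)
cases = convert_rows(rows, types)

# Inside a BEGIN PROGRAM Python block these could then feed the same
# calls as above (variable names X1, X2, ... are an assumption):
#   for i, t in enumerate(types):
#       MyDatasetObj.varlist.append('X%d' % (i + 1), t)
#   for case in cases:
#       MyDatasetObj.cases.append(case)
```

Inferring the string width from the data avoids hard-coding it, but it does require a pass over all rows before the variables are defined, so it gives up the line-by-line streaming mentioned earlier unless the widths are known in advance.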
