3

I've done some searches but still haven't found the answer to my problem. I have a large number of .csv files that I would like to convert to SPSS files. Say I have 1000 .csv files and would like to convert them into 1000 SPSS files. I can do this file by file by asking SPSS to read the data from the .csv, which takes a few clicks each time. However, since I have 1000 files, I'm looking for a way to do this without having to click a few thousand times and making lots of mistakes along the way. I'm very new to programming in general, so I would appreciate some for-dummies tips. Thanks a lot!

*Update: I've just included links to an example .csv file and the corresponding .sav file (csv file, sav file). All .csv files have the same structure. They are data from the same experiment, but from different (human) subjects.

TVV
  • Have you tried importing a simple CSV file via the Graphical User Interface, saving it as an SPSS file (presumably you mean `.SAV`), and looking inside it? How about posting a CSV and the corresponding SPSS file? – Mark Setchell Apr 15 '15 at 11:03
  • Hi Mark, thank you very much for your reply. I think what you said is exactly what I have been doing. I basically just open the .csv file from SPSS, click on "Open Data", and then specify in "Text Import Wizard" how I want the data to be in SPSS and then finally just save it as an .sav file. Yes, I indeed mean .sav files. I'm sorry I looked but it seems not possible to post an entire file on stackoverflow. Or do you mean I should upload the files somewhere (e.g. Dropbox) and then post the links? – TVV Apr 15 '15 at 11:36
  • You can paste a few lines of a simple CSV into your post, yes. And maybe a link to a Dropbox file for the `.sav` version, yes. You may find you have enough points to post a link now ;-) – Mark Setchell Apr 15 '15 at 11:38
  • Hi Mark, thank you for being so patient. I've just edited the post and included the 2 files. They should contain the same info, only that they have different formats. – TVV Apr 15 '15 at 11:55
  • Are the variables in all of your csv files the same, or are they different? – Andy W Apr 15 '15 at 12:44
  • Maybe this will help: https://www.gnu.org/software/pspp/manual/html_node/Invoking-pspp_002dconvert.html#Invoking-pspp_002dconvert – bf2020 Apr 15 '15 at 13:06
  • @AndyW: Yes, they are all the same. They are data from the same experiment, but from different (human) subjects. bf2020: Thank you! But I'm still clueless about what to do... Do I need to first install GNU? – TVV Apr 15 '15 at 15:40
  • I think you can get it here... https://www.gnu.org/software/pspp/get.html – Mark Setchell Apr 15 '15 at 15:47

3 Answers

1

If you go through SPSS's menus to open the first .csv file, you should be able to paste the syntax that opens the file rather than repeating the steps manually. On step 6 of 6 of the wizard it asks "Would you like to paste the syntax?"; select Yes. This should give you the syntax to read the file correctly. (I tried using the uploaded .csv file, but because of the way the variables are filled in I can't be sure whether they should be string, numeric, etc.) Once you have this, you can add syntax to save the open file as .sav. Then, to convert each file into a .sav, all you need to do is change the file number in the pasted syntax and in the SAVE command.

SAVE OUTFILE='C:\filepath\84.sav'
/COMPRESSED.

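There might be a way to run through the process automatically with a loop (more likely a macro or Python than DO REPEAT, which only repeats transformation commands within a single dataset), but this should serve as a starting point towards automation.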
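
One way to avoid editing the numbers by hand is to let a short script generate the syntax for every file. Below is a minimal sketch in plain Python (run outside SPSS), assuming the files are named by number as in the SAVE example above. The TEMPLATE text is only a stand-in for whatever the wizard pastes for your data, and `build_syntax`, the paths, and the two variables are hypothetical names chosen for illustration.

```python
# Stand-in template (an assumption): replace it with the syntax pasted by the
# wizard, and put {n} wherever the file number appears.
TEMPLATE = r"""GET DATA /TYPE=TXT
  /FILE='C:\filepath\{n}.csv'
  /DELIMITERS=","
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /VARIABLES=Age A3 Gender A6.
SAVE OUTFILE='C:\filepath\{n}.sav'
  /COMPRESSED.
"""


def build_syntax(template, numbers):
    """Return one syntax string with the template repeated for each number."""
    return "\n".join(template.format(n=n) for n in numbers)


# Generate syntax for files 1 to 3 and show it.
syntax = build_syntax(TEMPLATE, range(1, 4))
print(syntax)
```

Writing the result to a file (for example convert_all.sps) should give a single syntax file that can be opened and run in SPSS.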

figurine
1

You can iterate a set of syntax over large numbers of files specified by a wildcard or an explicit list by using the SPSSINC PROCESS FILES extension command. You write a syntax file that should be applied to each input. In that file you use the file handles or macros defined by PROCESS FILES to open a file. Then you run arbitrary syntax on it and, in your case, use the input macro to build an output file name and run a SAVE command.

PROCESS FILES appears on the Utilities menu as Process Data Files once the command is installed. It requires the Python Essentials and is part of the Essentials as of version 23. For V22, you can install it from the Utilities menu; for older versions you need to download it from the SPSS Community website (www.ibm.com/developerworks/spssdevcentral) > Downloads for SPSS Statistics > Extension Commands and install via Utilities.
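
To illustrate the idea of matching inputs with a wildcard and building each output name from its input name, here is a rough sketch in plain Python. It is not PROCESS FILES itself, only the same pattern written out by hand; `sav_name`, `csv_to_sav_pairs`, and the folder path are hypothetical.

```python
import glob
import os


def sav_name(csv_path):
    """Build the output name by swapping the input's extension for .sav."""
    base, _ext = os.path.splitext(csv_path)
    return base + ".sav"


def csv_to_sav_pairs(folder):
    """Return (csv_path, sav_path) pairs for every .csv file in folder."""
    pattern = os.path.join(folder, "*.csv")
    return [(path, sav_name(path)) for path in sorted(glob.glob(pattern))]


# The folder is a placeholder; point it at the directory holding the files.
for csv_path, sav_path in csv_to_sav_pairs(r"C:\filepath"):
    print(csv_path + " -> " + sav_path)
```

Because the list comes from the files actually present, this pattern does not depend on the files being numbered consecutively.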

JKP
  • I am completely new to this forum so I never knew my question received so many replies. Thank you so much JKP and I hope you will excuse me for the late thanks! – TVV Oct 21 '15 at 11:07
1

I'd rock it with the Python module and a range loop. This works for me, assuming each .csv file is named subject 1, subject 2, etc. and is in the exact same format. Also, replace the drive path with the correct one.

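Begin Program.
import spss

# Convert "subject 1.csv" through "subject 1000.csv" to .sav files.
for x in range(1, 1001):

   # Raw strings (r"...") keep the backslashes in the Windows paths literal.
   y = r"""GET DATA  /TYPE=TXT
     /FILE= 'C:\YOUR DRIVE PATH HERE\subject """ + str(x) + """.csv'
     /DELCASE=LINE
     /DELIMITERS=" ,"
     /QUALIFIER="'"
     /ARRANGEMENT=DELIMITED
     /FIRSTCASE=2
     /IMPORTCASE=ALL
     /VARIABLES=
     Age A3
     COL A4
     Clear A6
     CorrectAnswer A13
     Education A9
     Ethnicity A9
     Gender A6."""

   z = r"save outfile = 'C:\YOUR DRIVE PATH HERE\subject " + str(x) + ".sav'."

   # Print the generated syntax so string errors are easy to spot.
   # print(...) with a single argument works in both Python 2 and Python 3.
   print(y)
   print(z)
   spss.Submit(y)
   spss.Submit(z)

End Program.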

If you're new to Python, make sure to watch the indentation (white space) and include the rest of your variables, which I left out for space. If you're getting an error message, comment out the spss.Submit() commands with a # (e.g. #spss.Submit(y)) and check the Python printout for string errors. I hope that helps!
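
The "print first, submit later" check can also be wrapped in a small helper, together with a test for missing files so the loop does not stop halfway through. This is only a sketch under the same naming assumption (subject 1.csv, subject 2.csv, ...); `split_existing` and `run` are hypothetical helper names, and inside SPSS you would pass spss.Submit as the `submit` argument.

```python
import os


def split_existing(folder, numbers, exists=os.path.exists):
    """Return (found, missing) lists of file numbers for 'subject N.csv'."""
    found, missing = [], []
    for n in numbers:
        path = os.path.join(folder, "subject %d.csv" % n)
        if exists(path):
            found.append(n)
        else:
            missing.append(n)
    return found, missing


def run(command, submit=None):
    """Print the command for a dry run, or hand it to submit (e.g. spss.Submit)."""
    if submit is None:
        print(command)
    else:
        submit(command)


# Placeholder path, as in the answer above; replace it with the real folder.
found, missing = split_existing(r"C:\YOUR DRIVE PATH HERE", range(1, 1001))
print("%d files found, %d missing" % (len(found), len(missing)))
```

Looping over `found` instead of the full range, and calling `run(y)` before switching to `run(y, spss.Submit)`, keeps the dry run and the real run on the same code path.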

Tim Gottgetreu
  • I only read your reply now, but I've tried to learn some Python in the meantime and this looks brilliant to me. Thank you Tim! – TVV Oct 21 '15 at 11:08
  • Awesome, thanks for the feedback! Python has totally changed how I work with SPSS, couldn't do anything without it now :) – Tim Gottgetreu Oct 21 '15 at 17:45