
In many microdata catalogs of household surveys (for instance http://microdata.worldbank.org ...), the data dictionary (i.e. the codebook) is described in a .sps or .sas syntax text file that follows a clear structure. These scripts include the mapping between question and modality labels and the corresponding variable names in the raw dataset.

See, for instance, the first downloadable zip file in any open record from the catalog.

Is there an existing R function that can parse the .sps syntax file (preferable to the .sas file, since the question labels are fully preserved in the .sps) into a data frame that would make it easy to re-encode the dataset?
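To make the goal concrete, here is a minimal sketch of the kind of output I am after. It only covers the variable labels, and it assumes the .sps file has a `VARIABLE LABELS` keyword on its own line, followed by one `name "label"` pair per line and a closing period. The function name `parse_sps_variable_labels` and the sample lines are hypothetical and only for illustration; real catalog files may be laid out differently.

```r
# Hypothetical helper (not from any package): extract variable names and
# labels from the lines of an SPSS syntax file.
# Assumes "VARIABLE LABELS" sits on its own line, with one pair per line after it.
parse_sps_variable_labels <- function(lines) {
  lines <- trimws(lines)
  start <- grep("^VARIABLE LABELS", lines, ignore.case = TRUE)
  if (length(start) == 0) {
    return(data.frame(name = character(0), label = character(0),
                      stringsAsFactors = FALSE))
  }
  # Keep only the lines after the first VARIABLE LABELS keyword
  block <- lines[seq_along(lines) > start[1]]
  # The command ends at the first line that ends with a period
  end <- grep("\\.$", block)
  if (length(end) > 0) block <- block[seq_len(end[1])]
  # Optional "/", a variable name, then a quoted label, then an optional period
  pattern <- "^/?[[:space:]]*([A-Za-z_][A-Za-z0-9_.]*)[[:space:]]+[\"'](.*)[\"'][[:space:]]*\\.?$"
  hits <- regmatches(block, regexec(pattern, block))
  hits <- hits[lengths(hits) == 3]  # drop lines that did not match
  data.frame(
    name  = vapply(hits, `[`, character(1), 2),
    label = vapply(hits, `[`, character(1), 3),
    stringsAsFactors = FALSE
  )
}

# Made-up sample lines standing in for readLines() on a real .sps file
sps <- c(
  "VARIABLE LABELS",
  "  hhid \"Household ID\"",
  "  q1 \"Sex of respondent\"",
  "  .",
  "VALUE LABELS",
  "  q1 1 \"Male\" 2 \"Female\"",
  "  ."
)
codebook <- parse_sps_variable_labels(sps)
print(codebook)
```

On a real file, the lines would come from `readLines()` on the .sps path. The `VALUE LABELS` block (the modalities) is not handled by this sketch and is the harder part, which is why I am hoping an existing function already does this.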

The closest I have found is http://jason.bryer.org/posts/2013-01-10/Function_for_Reading_Codebooks_in_R.html, but it does not work out of the box for a .sps file.

There was also an old discussion here: http://r.789695.n4.nabble.com/how-to-read-sps-SPSS-file-extension-td875309.html, and another in the question "Input data into R from .dat and .sps files", but neither provides a solution.

Thanks in advance!

user3148607
