3

Lately, I sometimes get an error when reading SPSS files using read.spss from the foreign package:

Error in read.spss("sample.sav") : error reading system-file header
In addition: Warning message:
In read.spss("sample.sav") : sample.sav: Bad format specifier byte (0)

I produced a tiny sample.sav file with just one variable and three cases that triggers the error. Download the file or use:

library(foreign)  # provides read.spss
download.file("http://134.102.100.220/~mark/sample.sav", "sample.sav")
read.spss("sample.sav")

Any ideas?

My system

R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
foreign: Version 0.8-63

locale:
[1] en_US.UTF-8/de_DE.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
eli-k
Mark Heckmann
  • Just throwing it out there, but have you tried the [haven](https://github.com/hadley/haven) package as an alternative? – JasonAizkalns Mar 24 '15 at 14:43
  • @JasonAizkalns `haven::read_sav("sample.sav")` also fails – Mark Heckmann Mar 24 '15 at 14:54
  • I tried both the haven package and `read_spss` from [sjmisc](http://cran.r-project.org/web/packages/sjmisc/index.html) (former sjPlot-tool functions) and with both packages I could read your sample file w/o errors or warnings. – Daniel Mar 26 '15 at 07:34
  • @DanielLüdecke Yes, I was wrong there. `read_spss` does the job (also see my comment on the `haven` answer). – Mark Heckmann Mar 26 '15 at 20:54

2 Answers

4

I would use the haven package, rather than foreign, to read SPSS files:

library(haven)
sample <- read_spss("sample.sav")
View(sample)

You could alternatively use the sjPlot package, which uses haven to do its heavy lifting:

require("sjPlot")
sample <- sjPlot::read_spss("sample.sav", option = "haven")
View(sample)

Using sjPlot, you can also view the variable labels and values:

sjPlot::view_spss(sample)
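
A comment on this answer notes that `read_spss` has since moved from sjPlot to sjlabelled. A minimal sketch of the updated call, assuming both haven and sjlabelled are installed; the temporary file and its contents are made up for illustration so the snippet does not depend on sample.sav:

```r
# Write a small throwaway .sav file (hypothetical data) with haven.
path <- tempfile(fileext = ".sav")
haven::write_sav(data.frame(var1 = c(1, 2, 3)), path)

# Read it back with sjlabelled::read_spss (the function's current home).
dat <- sjlabelled::read_spss(path)
nrow(dat)
names(dat)
```

With the real file, the call would simply be `sjlabelled::read_spss("sample.sav")`.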
Phil
  • An update to an old question: `read_spss` is no longer in `sjPlot` but in `sjlabelled`. – DJV Aug 07 '19 at 21:27
3

$FL2@(#) IBM SPSS STATISTICS DATA FILE 64-bit Macintosh 20.0.0 ����������������������Y@24 Mar 1515:00:55electric paper �������������������VAR1 ���None�������������������������������–�����������È˝��������������ˇˇˇˇˇˇÔˇˇˇˇˇˇˇÔ˛ˇˇˇˇˇÔˇ���

That is the header as viewed in a simple text editor (TextEdit.app). Reading the help file for read.spss, one sees that it suggests using the memisc package:

install.packages("memisc")

?memisc::spss.system.file
memisc::spss.system.file("~/Downloads/sample.sav")
# SPSS system file '/Users/davidwinsemius/Downloads/sample.sav'
#     with 1 variables and 3 observations

inp <- memisc::spss.system.file("~/Downloads/sample.sav")
actual <- memisc::subset(inp, select = c(var1 = var1))
actual
# Data set with 3 observations and 1 variables
#
#   var1
# 1    1
# 2    2
# 3    3

The moral of the story: sometimes it is better to read all of the help file. Since I had read that same help page in the past, I was surprised to find that it had been modified. It used to contain comments regarding version limitations, which now seem to have been removed.

IRTFM
  • David, I tried to RTM for quite some time now, but cannot figure out what the `memisc` equivalent for `use.value.labels = FALSE` in `foreign::read.spss` is, when my goal is to have a dataframe. Can you help? – Mark Heckmann Mar 24 '15 at 22:30
  • An S4 object is returned. The `.Data` slot in the 'data.set'-classed object holds the values and there are 'row_names' and 'names' attributes. Seems that `as.data.frame` succeeds in returning an ordinary dataframe in the small test case you provided. Since I see no "value.labels", I'm not sure what problems you might be facing. – IRTFM Mar 24 '15 at 23:30
  • You are right, the comment was a general question... I added value labels to a new file `sample_labels.sav`. Running the code: `download.file("http://134.102.100.220/~mark/sample_labels.sav", "sample_labels.sav"); x <- spss.system.file("sample_labels.sav"); xx <- as.data.set(x); as.data.frame(xx);` I get a dataframe that contains value labels, not the raw values. I would like the raw values though... – Mark Heckmann Mar 25 '15 at 08:07
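
On the raw-values question in the last comment: one possible route, sketched here with haven rather than memisc, is to strip the value labels after import. The labelled vector below is a hypothetical stand-in for a variable from `sample_labels.sav` (the label names are invented); in practice it would come from `haven::read_sav()`, assuming haven can read that file.

```r
library(haven)

# Hypothetical labelled variable, standing in for a column that
# read_sav("sample_labels.sav") might return.
x <- labelled(c(1, 2, 3), labels = c(low = 1, mid = 2, high = 3))

# Raw codes: drop the value labels and keep the underlying numbers.
raw <- zap_labels(x)

# Labels instead of codes: convert to a factor.
lab <- as_factor(x)

raw
lab
```

`zap_labels()` also accepts a whole data frame, so it can be applied to the result of `read_sav()` directly. With foreign, the corresponding switch is the `use.value.labels = FALSE` argument of `read.spss` mentioned in the comments.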