[R] issues with read.spss

Jeroen Ooms jeroen.ooms at stat.ucla.edu
Sat Feb 11 03:12:03 CET 2012


Someone supplied me with an SPSS data file that causes a buffer
overflow followed by a crash when I read it into R. Unfortunately I
can't share the dataset itself, and I am having a hard time reproducing
the crash with a toy example. However, I found at least two issues that
might be related. I would like to know which of these are expected
behavior and which are bugs. I reproduced both on R 2.14.1, on Ubuntu
Linux as well as on Windows 7.

Some code follows. The files referenced in the code are available for
download at http://www.stat.ucla.edu/~jeroen/spss/

#load library
library(foreign)

#problem one: a long string variable is converted to multiple variables.
x <- read.spss("longstring.sav")
summary(x) #4 variables??

#problem two: use.value.labels does not deal correctly with duplicate
#labels and generates a bad factor.
x <- read.spss("duplicate_labels.sav", use.value.labels=TRUE)
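
To narrow down problem two, it may help to look at the value labels
before they are turned into a factor. The following is only a sketch:
duplicated_labels is a hypothetical helper (not part of foreign), and
the vector vl is made-up data standing in for one variable's labels. It
assumes that labels arrive as a named vector whose names are the labels
and whose values are the codes, which is how read.spss stores them in
the "value.labels" attribute when use.value.labels=FALSE.

```r
# Hypothetical helper: return the labels that occur more than once in a
# "value.labels"-style named vector (names = labels, values = codes).
duplicated_labels <- function(value_labels) {
  labs <- names(value_labels)
  unique(labs[duplicated(labs)])
}

# Made-up stand-in for one variable's value labels: two codes share "Yes".
vl <- c("Yes" = 1, "No" = 2, "Yes" = 3)
duplicated_labels(vl)

# Possible use on the real file (assumes it is in the working directory):
# library(foreign)
# x <- read.spss("duplicate_labels.sav", use.value.labels = FALSE)
# lapply(x, function(v) duplicated_labels(attr(v, "value.labels")))
```

Any variable for which this returns a non-empty result has several
codes mapped to the same label, which would be a candidate for the bad
factor above.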


