[R] read.spss: option "to.data.frame" and string variables

RINNER Heinrich HEINRICH.RINNER at tirol.gv.at
Tue Jan 12 11:28:26 CET 2010


Dear R-users,

I am using R version 2.10.1 and package foreign version 0.8-39 under Windows.

When reading .sav files (PASW Statistics 18.0.1) that contain string variables, the strings are automatically converted to factors if the option "to.data.frame = TRUE" is used (see example below).
It's clear to me why this happens (it is the default behaviour of a call to as.data.frame). But this is not always what one wants, and one might not even be aware of it.

So maybe one of the following improvements could be made?
* Add a description of this behaviour in ?read.spss.
* Or (even better): Add an extra argument, like: read.spss("C:\\temp\\test.sav", to.data.frame = TRUE, stringsAsFactors = FALSE).

Just a suggestion;
kind regards
Heinrich.

# EXAMPLE:
Suppose there is a simple file "test.sav", containing one variable ("x") of type STRING with 3 values (a,b,c).
> library(foreign)
> test <- read.spss("C:\\temp\\test.sav")
> test
$x
[1] "a       " "b       " "c       "

attr(,"label.table")
attr(,"label.table")$x
NULL

attr(,"codepage")
[1] 1252
> is.factor(test$x)
[1] FALSE
> is.character(test$x)
[1] TRUE
# Ok, that's just fine. But things change when using option "to.data.frame = TRUE":
> test <- read.spss("C:\\temp\\test.sav", to.data.frame = TRUE)
> test
         x
1 a
2 b
3 c
> is.factor(test$x)
[1] TRUE
> is.character(test$x)
[1] FALSE
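
# A possible workaround until something like the suggested argument exists: read the file as a plain list (the default) and do the conversion to a data frame explicitly. This is only a sketch. The list below is built by hand as a stand-in for the result of read.spss("C:\\temp\\test.sav") so that it runs without the file, and the name "df" is just illustrative.

```r
# Hand-built stand-in for the list returned by read.spss() without
# to.data.frame = TRUE (assumption: one padded string variable "x").
test <- list(x = c("a       ", "b       ", "c       "))

# Convert explicitly and keep strings as character vectors.
df <- as.data.frame(test, stringsAsFactors = FALSE)

# Optionally strip the trailing blanks that pad the string values.
df$x <- sub(" +$", "", df$x)

is.factor(df$x)     # FALSE
is.character(df$x)  # TRUE
```

With a real file, the first assignment would be replaced by the read.spss() call without "to.data.frame = TRUE"; attributes such as "label.table" are not carried over by this conversion.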


