[R] read.spss, locale and encodings

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Apr 8 15:03:06 CEST 2009


Hans Ekbrand wrote:
> I must be missing something obvious here:
> 
> According to the help page for read.spss, the reencode option is only
> active when R is run under a UTF-8 locale.

Not in my version:

reencode: logical: should character strings be re-encoded to the
           current locale.  The default, 'NA', means to do so in a UTF-8
           locale, only.  Alternatively character, specifying an
           encoding to assume.


> 
> read.spss can only import the SPSS file when run under an iso88591(5)
> locale; under a UTF-8 locale I get:
> 
> Error in read.spss("wo.sav") : error reading system-file header
> In addition: Warning message:
> In read.spss("wo.sav") :
>   wo.sav: position 143: Variable name begins with invalid character

So, does it help with reencode="Latin1"? Presumably the error comes from
assuming the file is UTF-8 when it isn't.
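
For concreteness, a minimal sketch of the call suggested above. The helper
name read_sav_latin1 is hypothetical, and "latin1" is the lowercase spelling
that R's iconv documentation uses; whether an explicit encoding gets past the
header error in foreign 8.27 is not confirmed here.

```r
library(foreign)

# Hypothetical helper (not part of foreign): read an SPSS file while
# telling read.spss() to assume Latin-1 strings, instead of relying on
# the locale-dependent default reencode = NA.
read_sav_latin1 <- function(path) {
  read.spss(path, to.data.frame = TRUE, reencode = "latin1")
}

# "wo.sav" is the file name from the original report; the guard keeps
# the script from failing when the file is not in the working directory.
if (file.exists("wo.sav")) {
  dat <- read_sav_latin1("wo.sav")
  str(dat)
}
```

If the strings still come out wrong, the file may use a different encoding,
and the value passed to reencode would have to change accordingly.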

> This is under Debian GNU/Linux, the stable release.
> 
> foreign is version 8.27

8.34 is used in the current prerelease. AFAIR, some issues with 
encodings were fixed recently.
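
To see which case applies on a given machine, something like the following
reports the installed version of foreign and whether the session is in a
UTF-8 locale. This is only a diagnostic sketch; the variable name
utf8_locale is made up for illustration.

```r
# Installed version of the foreign package.
print(packageVersion("foreign"))

# TRUE when the session runs in a UTF-8 locale, which is the case in
# which the default reencode = NA re-encodes character strings.
utf8_locale <- isTRUE(l10n_info()[["UTF-8"]])
print(utf8_locale)

# The character-type locale R is actually using.
print(Sys.getlocale("LC_CTYPE"))
```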

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



