[Rd] read.spss (foreign) conflict with SPSS 17 files.

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Dec 15 02:06:39 CET 2008


Jeroen Ooms wrote:
> SPSS seems to have changed its default datafile format, resulting in issues
> for read.spss(). In Windows this results in a warning, in Debian the import
> completely fails:
> 
> Debian (R version 2.8.0 (2008-10-20) i486-pc-linux-gnu, foreign_0.8-29)
> 
>> read.spss("/home/jeroen/samples/Tomato.sav")
> Error in iconv(names(rval), cp, "") :
>   unsupported conversion from 'CP65001' to ''
> In addition: Warning messages:
> 1: In read.spss("/home/jeroen/samples/Tomato.sav") :
>   /home/jeroen/samples/Tomato.sav: File-indicated character representation
> code (65001) looks like a Windows codepage
> 2: In read.spss("/home/jeroen/samples/Tomato.sav") :
>   /home/jeroen/samples/Tomato.sav: Unrecognized record type 7, subtype 20
> encountered in system file
> 
> 
> windows (R version 2.8.0 (2008-10-20), foreign_0.8-29)
> 
>> read.spss("C:/Program
>> Files/SPSSInc/Statistics17/Samples/English/Tomato.sav")
> 
> ...
>  
> attr(,"codepage")
> [1] 65001
> 
> Warning messages:
> 1: In read.spss("C:/Program
> Files/SPSSInc/Statistics17/Samples/English/Tomato.sav") :
>   C:/Program Files/SPSSInc/Statistics17/Samples/English/Tomato.sav:
> File-indicated character representation code (65001) looks like a Windows
> codepage
> 2: In read.spss("C:/Program
> Files/SPSSInc/Statistics17/Samples/English/Tomato.sav") :
>   C:/Program Files/SPSSInc/Statistics17/Samples/English/Tomato.sav:
> Unrecognized record type 7, subtype 20 encountered in system file
> 
> 
> I've shared some sample datafiles that are included with SPSS, so you can
> take a look: http://jeroen.xlshosting.net/samples/
> I hope there is a fix, I think importing data from SPSS is a very popular
> feature. 
> 
> Thank you!


Thanks,

It looks like adding reencode="utf8" to the read.spss() call avoids the
iconv error. The warnings appear to be harmless.

In fact, reencode="ascii" works for me as well on the Tomato.sav file. 
However, as far as I can google, Code Page 65001 _is_ UTF-8...
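
For reference, a minimal sketch of the workaround above. The path is the
one from the original report and is only a placeholder; the file.exists()
guard and the object names are illustrative, not part of foreign.

```r
library(foreign)

# Path from the original report; substitute your own SPSS 17 file.
sav <- "/home/jeroen/samples/Tomato.sav"

if (file.exists(sav)) {
  # Pass an explicit encoding name so read.spss() does not build one
  # from the codepage number (65001) stored in the file.
  dat <- read.spss(sav, reencode = "utf8")
  str(dat)
}
```

The two warnings (codepage and record type 7, subtype 20) should still be
printed; as noted, they appear to be harmless.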

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


