[R] read.spss and encodings

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Feb 4 18:53:04 CET 2007


Most of package 'foreign' was written to support only single-byte 
character sets.  Since CP1253 is not an encoding in use on Linux and your 
value labels are not valid in el_GR.iso88597 (I tried doing this in that 
locale: had you?), I think you are expecting far too much.

That R is unable to read binary files encoded in a charset not supported 
on your own system seems perfectly reasonable for any system, let alone a 
volunteer project.  You are very welcome to contribute a package to read 
such files, of course (and that people did is how package 'foreign' came 
into existence).
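
For the string values that do get read (VAR1 of 1.sav in the quoted output below), a minimal sketch of re-encoding them with base R's iconv() follows. It assumes the bytes really are windows-1253 and that the platform's iconv knows that encoding under the name "CP1253" (iconvlist() shows what is available). The variable names are illustrative only. This does nothing for 12.sav, which fails while the header is being read.

```r
# Two of the raw byte strings shown for VAR1 in the quoted output,
# including the trailing padding that read.spss() returns.
x <- c("\xf3\xf0\xdf\xf4\xe9     ",
       "\xe3\xf1\xe1\xf6\xe5\xdf\xef   ")

# Re-encode from windows-1253 to UTF-8.
# Assumption: "CP1253" is a name the local iconv recognises.
y <- iconv(x, from = "CP1253", to = "UTF-8")

# Drop the trailing padding spaces.
y <- sub(" +$", "", y)

print(y)
```

iconv() returns NA for any string it cannot convert, so checking anyNA(y) afterwards is a cheap way to spot bytes that are not valid in the assumed source encoding.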

On Sun, 4 Feb 2007, I. Soumpasis wrote:

> Hi!
>
> This mail is related to Thomas' mail, so I am following up.
>
> I use the Greek language, and SPSS files with value labels containing Greek
> characters cannot be imported with read.spss.
>
> I am on:
>
>> sessionInfo()
> R version 2.5.0 Under development (unstable) (2007-02-01 r40632)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=el_GR.UTF-8;LC_NUMERIC=C;LC_TIME=el_GR.UTF-8;LC_COLLATE=el_GR.UTF-8;LC_MONETARY=el_GR.UTF-8;LC_MESSAGES=el_GR.UTF-8;LC_PAPER=el_GR.UTF-8;LC_NAME=el_GR.UTF-8;LC_ADDRESS=el_GR.UTF-8;LC_TELEPHONE=el_GR.UTF-8;LC_MEASUREMENT=el_GR.UTF-8;LC_IDENTIFICATION=el_GR.UTF-8
>
>
> The following files are small examples used below:
> http://users.forthnet.gr/the/isoumpasis/data/1.sav
> http://users.forthnet.gr/the/isoumpasis/data/12.sav
>
> The first file has English value labels and can be read:
>> read.spss("~/Desktop/1.sav")
> $VAR1
> [1] "\xf3\xf0\xdf\xf4\xe9     "       "\xf3\xf0\xdf\xf4\xe9     "
> [3] "\xf3\xf0\xdf\xf4\xe9     "       "\xf3\xf0\xdf\xf4\xe9     "
> [5] "\xf3\xf0\xdf\xf4\xe9     "       "\xe3\xf1\xe1\xf6\xe5\xdf\xef   "
> [7] "\xe3\xf1\xe1\xf6\xe5\xdf\xef   " "\xe3\xf1\xe1\xf6\xe5\xdf\xef   "
> [9] "\xe3\xf1\xe1\xf6\xe5\xdf\xef   " "\xf3\xf0\xdf\xf4\xe9     "
> [11] "\xe3\xf1\xe1\xf6\xe5\xdf\xef   "
>
> $VAR2
> [1] 5 6 7 7 5 7 3 5 6 7 8
>
> attr(,"label.table")
> attr(,"label.table")$VAR1
> NULL
>
> attr(,"label.table")$VAR2
> NULL
>
> I can then convert the characters to Greek using Thomas' code, so there is
> no problem here.
>
> In file 12.sav the value labels are Greek. The problem is that the file
> cannot be read.
>
>> read.spss("~/Desktop/12.sav")
> Error in read.spss("~/Desktop/12.sav") : error reading system-file header
> In addition: Warning message:
> ~/Desktop/12.sav: position 0: Variable name begins with invalid character
>
> I also tried use.value.labels=FALSE and got the same message.
>
>> read.spss("~/Desktop/12.sav", use.value.labels=FALSE)
> Error in read.spss("~/Desktop/12.sav", use.value.labels = FALSE) :
>    error reading system-file header
> In addition: Warning message:
> ~/Desktop/12.sav: position 0: Variable name begins with invalid character
>
> The SPSS files are encoded in windows-1253 (Greek). The same problem
> presumably occurs with other non-ASCII characters too. Is there any
> workaround for this?
>
> Thanks in advance
> I.Soumpasis

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


