[R] How can I import user-defined missings from Spss?

Christine Christmann christinechristmann at web.de
Tue Apr 15 22:59:09 CEST 2008


OK, if I have to make the SPSS file available, I hope an attachment is fine.

I would really appreciate it if any 'kind soul' could give me a push in the right direction to solve this problem.

The SPSS file contains only two variables and five cases. For one case, both values are defined as missing.
In R, all cases are treated as valid; the information about missing values is lost. What can I do to keep it?
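
One possible workaround, sketched here under stated assumptions, is to re-apply the SPSS missing-value definitions by hand after the import. The data frame below is a hypothetical stand-in for the imported file (the five cases from the example), and `missing_rules`, `recode_user_missing()` and the attribute name `user_missing_codes` are illustrative names, not part of foreign or Hmisc. The rules are copied manually from the SPSS syntax, so they must be kept in sync with the .sav file.

```r
# Hypothetical stand-in for the imported data: the five example cases as
# they arrive in R when the user-defined missing definitions are ignored.
mydata <- data.frame(
  age   = c(22, 40, 69, 19, -99),
  sport = c(1, 2, 1, 2, 9)
)

# Assumed rules, transcribed by hand from the SPSS syntax:
#   missing values age (LO thru 0).  -> age <= 0
#   missing values sport (9).        -> sport == 9
missing_rules <- list(
  age   = function(x) x <= 0,
  sport = function(x) x == 9
)

# Hypothetical helper: sets user-defined missing values to NA and keeps
# the original codes in an attribute so the information is not lost.
recode_user_missing <- function(df, rules) {
  for (v in names(rules)) {
    original <- df[[v]]
    is_miss  <- !is.na(original) & rules[[v]](original)
    df[[v]][is_miss] <- NA
    attr(df[[v]], "user_missing_codes") <- ifelse(is_miss, original, NA)
  }
  df
}

recoded <- recode_user_missing(mydata, missing_rules)

# Inspect the result: NA in the data, original codes in the attribute.
print(recoded)
print(attr(recoded$age, "user_missing_codes"))
print(attr(recoded$sport, "user_missing_codes"))
```

Keeping the codes in an attribute rather than as a string such as "miss" leaves the columns numeric, but note that many R operations (subsetting, for instance) drop such attributes.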

Cheers Christine

*-----------------------------------------------------------------------------------------.
> >
> >
> > Import the SPSS data into R, via Hmisc or foreign (both work fine).
> >
> > #import Spssdata in R
> > spssfile <- "PathToTheSavedSpssFile"
> >
> > #via Hmisc
> > library(Hmisc)
> > Signs <- c("_")
> > mydata1 <- spss.get(spssfile,lowernames=TRUE, allow=Signs)
> >
> > #via foreign
> > library(foreign)
> > mydata2 <- read.spss(spssfile,use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> >
> > #freq in r
> > describe(mydata1)
> > describe(mydata2)


> -----Original Message-----
> From: "Prof Brian Ripley" <ripley at stats.ox.ac.uk>
> Sent: 15.04.08 12:13:45
> To: Christine Christmann <christinechristmann at web.de>
> CC: r-help at r-project.org
> Subject: Re: [R] How can I import user-defined missings from Spss?


> 
> You have already had a reply to a version of this (posted from another 
> address) at https://stat.ethz.ch/pipermail/r-help/2008-April/159342.html . 
> 'Kind souls' are likely to get exasperated when their help is 
> unacknowledged.
> 
> You need SPSS and Windows to reproduce this, and this is the R forum.  To 
> fulfil the footer of the message you need to make available the spss save 
> file.
> 
> On Tue, 15 Apr 2008, Christine Christmann wrote:
> 
> > Hi,
> >
> > I can import SPSS datasets via library(foreign) with read.spss, or via library(Hmisc) with spss.get.
> > But whichever way I import the data, user-defined missing values from SPSS are always lost.
> > (It makes no difference whether they are a single value, a range, or any combination of these; they are always ignored.)
> > Is there any way in R to find out whether a value was user-defined missing in SPSS?
> > Keeping the information as an attribute would suit me fine; keeping the values as a character string such as "miss" would be even better.
> > Transforming them into NA, as happens automatically with SPSS system-missing (sysmis) values, would be another alternative.
> >
> > Unfortunately, I don't know whether any of these options is possible. Could you help me out?
> >
> > Let me give you an example:
> > Preconditions: You need to have SPSS on your computer to generate the SPSS data.
> > You need to create the folder C:/tmp to save the SPSS file. As you can see, I work on Windows.
> >
> > */1) Generate the SpssData:
> > */data.
> > DATA LIST LIST /age (f2) sport (f2).
> > BEGIN DATA
> > 22, 1
> > 40, 2
> > 69, 1
> > 19, 2
> > -99, 9
> > END DATA.
> >
> >
> > */description.
> > missing values age (LO thru 0).
> > missing values sport (9).
> > var label age "age".
> > var label sport "Do you like sports".
> > value label sport
> > 1 "yes"
> > 2 "no"
> > 3 "don't know".
> >
> > *frequencies in Spss.
> > freq age sport.
> >
> >
> > save outfile = "C:\tmp\test.sav".
> > *-----------------------------------------------------------------------------------------.
> >
> >
> > 2) Import the Spss Data in R. Via Hmisc or foreign - both work fine.
> >
> > #import Spssdata in R
> > spssfile <- "C:/tmp/test.sav"
> >
> > #via Hmisc
> > library(Hmisc)
> > Signs <- c("_")
> > mydata1 <- spss.get(spssfile,lowernames=TRUE, allow=Signs)
> >
> > #via foreign
> > library(foreign)
> > mydata2 <- read.spss(spssfile,use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> >
> > #freq in r
> > describe(mydata1)
> > describe(mydata2)
> >
> >
> > *-----------------------------------------------------------------------------------------.
> > Have a look at the two variables age and sport. In SPSS, the value -99 in age is missing, as is the value 9 in sport.
> > As you can see, the information about the missing values is lost in R. What can I do?
> >
> >
> > Many Thanks Christine Christmann
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 



