[R] "Types" of missingness

Gabor Grothendieck ggrothendieck at gmail.com
Sun Feb 28 13:23:05 CET 2010


Check out:
https://stat.ethz.ch/pipermail/r-help/2010-February/228228.html
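
Independently of that link, here is a minimal sketch of one common
workaround, using the data from the question below. It is only an
illustration, not necessarily what the linked post proposes, and the
column name wage_miss is made up for this example. The idea is to keep
the reason for missingness in a separate factor and set the coded wages
to NA, so that lm() drops those rows through its default na.action while
the reason stays available for later subsetting.

```r
# Data from the question: wage -1 = "didn't work", -2 = "refused to respond"
dat <- data.frame(
  hours = c(36, 40, 40, 0, 37.5, 0, 36, 20, 40),
  wage  = c(15.5, 7.5, 8, -1, 17.5, -1, -2, 13, -2))

# Store the reason in a separate factor (wage_miss is a hypothetical name).
# Values that are not listed in 'levels' become NA, i.e. "not missing".
dat$wage_miss <- factor(dat$wage, levels = c(-1, -2),
                        labels = c("didn't work", "refused"))

# Replace the coded values by an ordinary NA
dat$wage[!is.na(dat$wage_miss)] <- NA

# lm() omits rows with NA in the model variables by default
fit <- lm(hours ~ wage, data = dat)

# Later, operate only on the observations that refused to respond;
# which() drops the NA comparisons for rows with an observed wage
refused <- dat[which(dat$wage_miss == "refused"), ]
```

Unlike Stata's .a, .b codes, the two columns are not tied together here,
so any code that modifies wage also has to keep wage_miss in sync.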

On Sun, Feb 28, 2010 at 2:39 AM, Christian Raschke
<crasch2 at tigers.lsu.edu> wrote:
> Dear R-List,
>
> My question concerns missing values. Specifically, is it possible to
> use different "types" of missingness in a dataset and not a
> one-size-fits-all NA?
> For example, data may be missing because of an outright refusal by a
> respondent to answer a question, or because she didn't know an answer,
> or because the item simply did not apply. In later analysis it is
> sometimes useful to be able to distinguish between the cases, but
> nonetheless have them all treated as missing when using, say, lm( ).
> In Stata this is possible by using different missing value indicators.
> The standard one is a period '.' whereas '.a' and '.b' etc are treated
> as missing too, but can all be distinguished from one another (they are even
> ordinal such that . < .a < .b).
> To give a simplistic example in R, let
>
>  > dat <- data.frame(
> + hours = c(36, 40, 40, 0, 37.5, 0, 36, 20, 40),
> + wage = c( 15.5, 7.5, 8, -1, 17.5, -1, -2, 13, -2))
>  > dat
>   hours wage
> 1  36.0 15.5
> 2  40.0  7.5
> 3  40.0  8.0
> 4   0.0 -1.0
> 5  37.5 17.5
> 6   0.0 -1.0
> 7  36.0 -2.0
> 8  20.0 13.0
> 9  40.0 -2.0
>
>
> where for wages -1 indicates "didn't work" and -2 indicates "refused to
> respond". How could I replace the negative values for wages with
> missingness indicators to use the data frame in, for instance, lm( ), but
> later operate only on those observations that "refused to respond"?
> Of course I can always work around this somehow, especially in this easy
> example, but as data frames get larger and cases more complex the
> workarounds seem more and more klutzy to me.
> So, if there is an easy way to do this that I have overlooked, I would
> be grateful for any advice or references.
>
> Best,
> Christian
>
> --
> Christian Raschke
> Department of Economics
> and
> ISDS Research Lab (HSRG)
> Louisiana State University
> Patrick Taylor Hall, Rm 2128
> Baton Rouge, LA 70803
> crasch2 at lsu.edu