[R] removing NA from a data frame

Petr PIKAL petr.pikal at precheza.cz
Fri Jun 22 11:58:43 CEST 2012


Hi

Both na.omit() and complete.cases() work smoothly for me when NA is not a 
valid level of the factor.

If NA is a valid level in your case, as it seems to be, you need to reset 
the factor levels so that NA is no longer a valid level.

ex10s$dg <- factor( ex10s$dg )

Both commands should then work.
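
To illustrate the point, here is a minimal sketch on toy data. The data 
frames d and s and their columns are made up for illustration and are not 
your ex10s. It assumes the NA is either a true NA level (as created with 
exclude = NULL or addNA()) or, as a second possibility I cannot verify 
from your output, the literal string "NA".

```r
# Toy data; `d`, `id`, and `dg` are made-up names for illustration only.
d <- data.frame(id = 1:5,
                dg = factor(c("0", "1", NA, "5", NA), exclude = NULL))

levels(d$dg)          # NA appears among the levels
any(is.na(d$dg))      # FALSE: the underlying codes are not missing
nrow(na.omit(d))      # 5: nothing is dropped

# Rebuilding the factor with the default exclude = NA drops the NA level,
# so the affected elements become genuinely missing.
d$dg <- factor(d$dg)
nrow(na.omit(d))          # 3
sum(complete.cases(d))    # 3

# Assumed variant: the level is the literal string "NA", not a missing value.
# factor() alone keeps that level, so set those elements to NA first.
s <- data.frame(id = 1:4, dg = factor(c("0", "NA", "5", "NA")))
s$dg[s$dg == "NA"] <- NA
s$dg <- factor(s$dg)      # drops the now-unused "NA" level
nrow(na.omit(s))          # 2
```

In both variants the rows only disappear once is.na() reports TRUE for 
them, which is what na.omit() and complete.cases() rely on.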

Regards
Petr


> 
> Removing rows with NAs, using na.omit(), doesn't seem to be working for me.
> 
> Dataset:
> 
> > str ( ex10s )
> 
> 'data.frame':   2189576 obs. of  5 variables:
> $ LOPNR  : int  58 58 58 58 64 64 64 64 64 64 ...
> $ DIAGNOS: Factor w/ 173 levels "F20","F200","F2000",..: 128 128 128 128 105 105 105 160 105 105 ...
> $ X_DATE : int  20060821 20061207 20080102 20090904 20010327 20010925 20020307 20021007 20021007 20030320 ...
> $ SOURCE : int  2 2 2 2 2 2 2 2 2 1 ...
> $ dg     : Factor w/ 7 levels "0","1","2","3",..: 6 6 6 6 5 5 5 6 5 5 ...
> 
> The only NAs are in the factor dg (put in by 'recode' from the car 
> library; I'm trying to eliminate cases with particular factor levels)
> 
> > table ( ex10s$dg )
> 
>       0       1       2       3       4       5      NA
>    2851  271501   63112   98425  335593 1257299  160795
> 
> So, I remove the rows with NAs, to a new dataframe ex10ss:
> 
> > ex10ss<-na.omit(ex10s)
> 
> Check all the NAs have been removed:
> 
> > table(ex10ss$dg)
> 
>       0       1       2       3       4       5      NA
>    2851  271501   63112   98425  335593 1257299  160795
> 
> > dim(ex10s)
> [1] 2189576       5
> > dim(ex10ss)
> [1] 2189576       5
> 
> Nothing seems to have changed. I want all the rows with NA in removed.
> 
> I am clearly doing something wrong.
> 
> The only alternative I could find is pretty similar:
> use <- complete.cases ( ex10 )
> ex10ss<-ex10s[use,]
> which leads to the same result.
> 
> 
> Stuart
> 
> 
> Dr Stuart John Leask DM FRCPsych MB Mchir
> Clinical Senior Lecturer and Honorary Consultant Psychiatrist
> Institute of Mental Health, Innovation Park
> Triumph Road, Nottingham, Notts. NG7 2TU. UK
> Tel. +44 115 82 30419 stuart.leask at nottingham.ac.uk
> Google 'Dr Stuart Leask'
> 
> 


