[R] deleting specified NA values

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Mon Nov 1 16:03:07 CET 2004


On 01-Nov-04 Robert Brown FM CEFAS wrote:
> I have a data set of about 10000 records which was compiled from
> several smaller data sets using SPSS. During compilation 88 false
> records were accidentally introduced which comprise all NA values.  I
> want to delete these records but not other missing data.  The functions
> na.exclude and na.omit seem to remove all values of NA. How can I
> delete just the relevant NAs? That is, I want to delete all records in
> the data frame DATA where the field age contains NA values.

Hi Robert,
It's not quite clear what your "NA" criterion for deletion really is.

If (as you state first) the false records "comprise all NA values",
this suggests that in such a record every field is "NA".

On the other hand, you say you "want to delete all records in
the data frame DATA where the field age contains NA values", so
it looks as though you can check for deletion on the field "age"
only.

Suppose your dataframe is called DF.

In the second case, which is simpler, you can do

  newDF <- DF[!is.na(DF$age),]
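
To see what this does, here is a small sketch on made-up data. The
data frame and the columns "weight" and "sex" are invented purely for
illustration; only "age" comes from your description.

```r
# Hypothetical toy data; every value here is invented for illustration.
DF <- data.frame(
  age    = c(23, NA, 41, NA, 35),
  weight = c(70, NA, NA, 62, 80),
  sex    = c("M", NA, "F", "F", NA),
  stringsAsFactors = FALSE
)

# Keep only the rows in which age is not NA.
newDF <- DF[!is.na(DF$age), ]

# The rows with a missing age are gone, but NAs in other columns stay.
print(newDF)
```

Rows with NA in "weight" or "sex" are kept as long as "age" is present,
which is the difference from na.omit.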

In the first case, it's fundamentally the same but you have to
run the check along every element in each row. So define a function

  notallna <- function(x) !all(is.na(x))

and then

  newDF <- DF[apply(DF,1,notallna),]

This keeps every record in which not all fields are "NA", so it
retains records in which only some fields are "NA".
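
As a rough check of the all-NA approach, the sketch below runs it on
made-up data and compares it with na.omit. The data frame and its
columns are invented for illustration, and the rowSums variant is an
alternative I am suggesting, not something your data requires.

```r
# Hypothetical toy data; row 2 is the only record that is entirely NA.
DF <- data.frame(
  age    = c(23, NA, 41, NA, 35),
  weight = c(70, NA, NA, 62, 80),
  sex    = c("M", NA, "F", "F", NA),
  stringsAsFactors = FALSE
)

# TRUE for a row unless every field in it is NA.
notallna <- function(x) !all(is.na(x))
keep_apply <- apply(DF, 1, notallna)

# Alternative without apply(): count the NAs per row and keep the
# rows in which that count is below the number of columns.
keep_rowsums <- rowSums(is.na(DF)) < ncol(DF)

newDF <- DF[keep_rowsums, ]

# For comparison, na.omit drops every row that has any NA at all.
strictDF <- na.omit(DF)

print(nrow(newDF))
print(nrow(strictDF))
```

Both logical vectors select the same rows here. The rowSums form avoids
the row-by-row function calls, which may matter a little on larger data
frames, though with about 10000 records either should be quick.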

Hoping this helps,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 01-Nov-04                                       Time: 16:03:07
------------------------------ XFMail ------------------------------



