[R] Replacing NAs in a data frame using is.na() fails if there are no NAs

Fri Jan 14 12:56:45 CET 2005

On Fri, 14 Jan 2005, michael watson (IAH-C) wrote:

> Hi
>
> This is a difference between the way matrices and data frames work I
> guess.  I want to replace the NA values in a data frame by 0, and the
> code works as long as the data frame in question actually includes an NA
> value.  If it doesn't, there is an error:
>
> df <- data.frame(c1=c(1,1,1),c2=c(2,2,NA))
> df[is.na(df)] <- 0
> df
>
> df <- data.frame(c1=c(1,1,1),c2=c(2,2,2))
> df[is.na(df)] <- 0
> df
>
> Any help would be appreciated.  I could just convert the data frame to a
> matrix, execute the code, then convert it back to a data frame, but that
> appears long-winded.

As always, look at the objects:

> is.na(df)
      c1    c2
1 FALSE FALSE
2 FALSE FALSE
3 FALSE FALSE

so there is nothing to replace by 0.
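
The same check can be made without reading the printed matrix, by counting the TRUE entries. A minimal sketch using the second data frame from the question (the names ind, n_missing and any_missing are only illustrative):

```r
df <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, 2))

ind <- is.na(df)         # logical matrix with the same dimensions as df
n_missing <- sum(ind)    # TRUE counts as 1, so this is the number of NAs
any_missing <- any(ind)  # FALSE when there is nothing to replace
```

Here n_missing is 0, which is the case in which the replacement has nothing to work on.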

What you should have is

ind <- is.na(df)
df[ind] <- rep(0, sum(ind))

to give the right number of replacements.
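
For illustration, the two lines above can be wrapped in a small function and run on both data frames from the question. This is only a sketch: the name replace_na_zero is made up here, and the behaviour of the plain df[is.na(df)] <- 0 form may differ between R versions.

```r
# Hypothetical helper (not from the original post): replace NAs by 0,
# supplying exactly as many replacement values as there are NAs.
replace_na_zero <- function(df) {
  ind <- is.na(df)             # logical matrix, same shape as df
  df[ind] <- rep(0, sum(ind))  # zero-length replacement when no NAs
  df
}

with_na    <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, NA))
without_na <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, 2))

res_with    <- replace_na_zero(with_na)     # NA in c2 becomes 0
res_without <- replace_na_zero(without_na)  # returned unchanged
```

Because the replacement vector is built from sum(ind), its length always matches the number of cells selected, including the case of zero.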

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595