[R] evaluating NAs in a dataframe

Daniel Nordlund djnordlund at frontier.com
Wed Dec 8 21:54:12 CET 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Wade Wall
> Sent: Wednesday, December 08, 2010 12:11 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] evaluating NAs in a dataframe
> 
> Hi all,
> 
> How can one evaluate NAs in a numeric dataframe column?  For example, I have
> a dataframe (demo) with a column of numbers and several NAs. If I write
> demo.df >= 10, numerals will return TRUE or FALSE, but if the value is
> "NA", "NA" is returned.  But if I write demo.df == "NA", it returns as "NA"
> also.  I know that I can remove NAs, but would like to keep the dataframe as
> is without creating a subset.  I basically want to add a line that evaluates
> the NA in the demo dataframe.
> 
> As an example, I want to assign rows to classes based on values in
> demo$Area. Some of the values in demo$Area are "NA"
> 
> for (i in 1:nrow(demo)) {
>   if (demo$Area[i] > 0 && demo$Area[i] < 10) {Class[i] <- "S01"}       ## 1-10 cm2
>   if (demo$Area[i] >= 10 && demo$Area[i] < 25) {Class[i] <- "S02"}     ## 10-25 cm2
>   if (demo$Area[i] >= 25 && demo$Area[i] < 50) {Class[i] <- "S03"}     ## 25-50 cm2
>   if (demo$Area[i] >= 50 && demo$Area[i] < 100) {Class[i] <- "S04"}    ## 50-100 cm2
>   if (demo$Area[i] >= 100 && demo$Area[i] < 200) {Class[i] <- "S05"}   ## 100-200 cm2
>   if (demo$Area[i] >= 200 && demo$Area[i] < 400) {Class[i] <- "S06"}   ## 200-400 cm2
>   if (demo$Area[i] >= 400 && demo$Area[i] < 800) {Class[i] <- "S07"}   ## 400-800 cm2
>   if (demo$Area[i] >= 800 && demo$Area[i] < 1600) {Class[i] <- "S08"}  ## 800-1600 cm2
>   if (demo$Area[i] >= 1600 && demo$Area[i] < 3200) {Class[i] <- "S09"} ## 1600-3200 cm2
>   if (demo$Area[i] >= 3200) {Class[i] <- "S10"}                        ## >3200 cm2
>   }
> 
> What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
> demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"
> 
> Thanks for any help
> 
> Wade
> 

Wade,

As you have discovered, any comparison with NA (including == "NA") itself returns NA, and if() cannot act on that.  So you need to test for NA first, and to do that you need to use is.na().  Something like this should work:

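Class <- character(nrow(demo))  # result vector; skip if Class already exists
for (i in seq_len(nrow(demo))) {
  # Test for NA first. Each later test needs only the upper bound,
  # because the earlier branches have already ruled out smaller values.
  # Note: anything below 10 (including values <= 0) is labelled "S01".
  if (is.na(demo$Area[i])) Class[i] <- "Sna" else
  if (demo$Area[i] < 10) Class[i] <- "S01"   else
  if (demo$Area[i] < 25) Class[i] <- "S02"   else
  if (demo$Area[i] < 50) Class[i] <- "S03"   else
  if (demo$Area[i] < 100) Class[i] <- "S04"  else
  if (demo$Area[i] < 200) Class[i] <- "S05"  else
  if (demo$Area[i] < 400) Class[i] <- "S06"  else
  if (demo$Area[i] < 800) Class[i] <- "S07"  else
  if (demo$Area[i] < 1600) Class[i] <- "S08" else
  if (demo$Area[i] < 3200) Class[i] <- "S09" else
  Class[i] <- "S10"
  }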
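
The same classification can probably also be done without a loop.  What follows is only a minimal sketch, not something from the original exchange: the demo data frame and its Area values are made up for illustration, and the names breaks and labels are just local helpers.  It first shows how NA behaves in comparisons, then uses base R's cut() with right = FALSE so the bins line up with the loop above, and finally relabels the missing values.

```r
# Hypothetical sample data standing in for the poster's `demo` data frame.
demo <- data.frame(Area = c(5, 12, NA, 99, 3200, 4000))

# Comparisons involving NA yield NA, which if() cannot use;
# is.na() is the test that returns TRUE or FALSE for every element.
demo$Area >= 10      # NA in the position of the missing value
demo$Area == "NA"    # also NA there, not TRUE
is.na(demo$Area)     # TRUE only for the missing value

# Bin edges matching the loop: below 10, 10 to <25, ..., 3200 and up.
breaks <- c(-Inf, 10, 25, 50, 100, 200, 400, 800, 1600, 3200, Inf)
labels <- sprintf("S%02d", 1:10)   # "S01" ... "S10"

# right = FALSE makes each interval closed on the left, so 10 lands in "S02".
Class <- as.character(cut(demo$Area, breaks = breaks, labels = labels,
                          right = FALSE))

# cut() leaves NA for missing areas; give them the same label as the loop.
Class[is.na(Class)] <- "Sna"
Class
```

Because cut() passes NA through instead of raising an error, the is.na() step moves to the end here, and the data frame itself is left untouched.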

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA


