[R] Adding two or more columns of a data frame for each row when NAs are present.

David Winsemius dwinsemius at comcast.net
Mon Nov 21 06:03:21 CET 2011


On Nov 20, 2011, at 3:38 PM, Ian Strang wrote:

>
> I am fairly new to R and would like help with the problem below. I
> am trying to sum and count several rows in the data frame yy below.
> All works well as in example 1. When I try to add the columns, with
> an NA in Q21, I get NA as mySum. I would like NA to be treated as
> 0, or ignored.

"Ignored" is by far the better option, and that is easily accomplished
by reading the help page for 'sum' and using the obvious parameter
setting (na.rm = TRUE).

?sum
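
A minimal sketch of what that setting does, assuming the parameter meant here is na.rm (the vector v is made up for illustration):

```r
v <- c(1, NA, 4)                  # hypothetical vector with one missing value

plain   <- sum(v)                 # NA: a single missing value makes the total missing
ignored <- sum(v, na.rm = TRUE)   # 5: missing values are dropped before summing

plain
ignored
```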


> I wrote a function to try to count an NA element as 0, Example 3  
> function. It works with a few warnings, Example 4, but still gives  
> NA instead of the addition when there is an NA in an element.
>
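
The warnings in Examples 4 and 5 arise because if() expects a single TRUE or FALSE. Given a whole column, it tested only the first element (R 4.2.0 and later treat this as an error), so the column came back unchanged, NA included. A rough sketch of a vectorised variant, shown only to explain the warning; zeroIfNA is an illustrative name, and the na.rm route is usually preferable to substituting zeros:

```r
# Hypothetical vectorised helper (not from the original post):
# ifelse() tests every element, where if() tests only one.
zeroIfNA <- function(x) ifelse(is.na(x), 0, as.numeric(x))

q21 <- c(1, NA, 1)        # the Q21 column from the example data

flags  <- is.na(q21)      # FALSE TRUE FALSE: one answer per element
filled <- zeroIfNA(q21)   # 1 0 1

flags
filled
```
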
> In Example 6 & 7, I tried using sum() but it just sums the whole  
> data frame, I think,

It sums whatever you give it.
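
To illustrate with the data from Example 6: sum() collapses all of its arguments into one number, and transform() then recycles that single number down the new column. A small sketch, rebuilding yy with data.frame() for self-containment:

```r
yy <- data.frame(Q20 = c(0, 1, 2), Q21 = c(1, NA, 1), Q22 = c(2, 2, 2),
                 Q23 = c(3, 3, 3), Q24 = c(4, 4, 4))

# One grand total over three whole columns, not one total per row:
# (0 + 1 + 2) + (1 + 1) + (3 + 3 + 3) = 14
total <- sum(yy$Q20, yy$Q21, yy$Q23, na.rm = TRUE)
total
```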

>
> How do I add together several columns giving the result for each row  
> in mySum?

?rowSums  # which also has the same parameter setting for dealing with NAs.
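
A brief sketch of rowSums() on the example data, assuming the same na.rm setting; yy is rebuilt here with data.frame() so the snippet runs on its own:

```r
yy <- data.frame(Q20 = c(0, 1, 2), Q21 = c(1, NA, 1), Q22 = c(2, 2, 2),
                 Q23 = c(3, 3, 3), Q24 = c(4, 4, 4))

withNA    <- rowSums(yy)                 # 10 NA 12: row 2 contains an NA
withoutNA <- rowSums(yy, na.rm = TRUE)   # 10 10 12: the NA is skipped

withNA
withoutNA
```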


> NA should be treated as a 0.

Nooo, noooo, nooooooooo. If it's missing, it's not 0.
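
A small illustration of why the distinction matters (the vector is made up): for a plain sum the two approaches happen to agree, but as soon as you count or average, a substituted 0 is treated as a real observation.

```r
v <- c(1, NA, 4)                       # hypothetical scores, one missing

meanIgnored <- mean(v, na.rm = TRUE)   # 2.5: average of the two observed values

vZero <- v
vZero[is.na(vZero)] <- 0               # pretend the missing score was 0
meanZero <- mean(vZero)                # 5/3: the made-up 0 drags the mean down

meanIgnored
meanZero
```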

> Please, note, I do not want to sum all the columns, as I think  
> rowSums would do, just the selected ones.

Fine. Then select them:

?"["
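
Putting the pieces together, one possible sketch: select the columns by name with "[", then apply rowSums() to just those. The name cols is illustrative; the columns are the ones used in the poster's Example 2.

```r
yy <- data.frame(Q20 = c(0, 1, 2), Q21 = c(1, NA, 1), Q22 = c(2, 2, 2),
                 Q23 = c(3, 3, 3), Q24 = c(4, 4, 4))

cols <- c("Q20", "Q21", "Q24")                    # only the columns of interest

yy$mySum   <- rowSums(yy[, cols], na.rm = TRUE)   # per-row sum, NAs ignored
yy$myCount <- rowSums(!is.na(yy[, cols]))         # per-row count of non-missing values
yy
```

With the example data this gives mySum of 5, 5, 7 and myCount of 3, 2, 3, so the row with the NA gets a sum over its two observed values rather than NA.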

-- 
David.
>
> Thanks for your help.
> Ian,
>
> > yy <- read.table( header = T, sep=",", text =     ## to create a data frame
> + "Q20, Q21, Q22, Q23, Q24
> +  0,1, 2,3,4
> +  1,NA,2,3,4
> +  2,1, 2,3,4")
> +  yy
>  Q20 Q21 Q22 Q23 Q24
> 1   0   1    2   3   4
> 2   1  NA   2   3   4
> 3   2   1    2   3   4
>
> > x <- transform( yy,     ############## Example 1
> +   mySum = as.numeric(Q20) + as.numeric(Q22) + as.numeric(Q24),
> +   myCount = as.numeric(!is.na(Q20)) + as.numeric(!is.na(Q21)) + as.numeric(!is.na(Q24))
> + )
> + x
>  Q20 Q21 Q22 Q23 Q24 mySum myCount
> 1   0   1    2   3   4     6       3
> 2   1  NA   2   3   4     7       2
> 3   2   1    2   3   4     8       3
> >
> + x <- transform( yy,     ################ Example 2
> +   mySum = as.numeric(Q20) + as.numeric(Q21) + as.numeric(Q24),
> +   myCount = as.numeric(!is.na(Q20)) + as.numeric(!is.na(Q21)) + as.numeric(!is.na(Q24))
> + )
> + x
>  Q20 Q21 Q22 Q23 Q24 mySum myCount
> 1   0   1    2   3   4     5       3
> 2   1  NA   2   3   4    NA       2
> 3   2   1    2   3   4     7       3
>
> > NifAvail <- function(x) { if (is.na(x)) x<-0 else x <- x    ############### Example 3
> +   return(as.numeric(x))
> + } #end function
> + NifAvail(5)
> [1] 5
> + NifAvail(NA)
> [1] 0
>
> > x <- transform( yy,
> +   mySum = NifAvail(Q20) + NifAvail(Q22) + NifAvail(Q24),     ############### Example 4
> +   myCount = as.numeric(!is.na(Q20)) + as.numeric(!is.na(Q21)) + as.numeric(!is.na(Q24))
> + )
> Warning messages:
> 1: In if (is.na(x)) x <- 0 else x <- x :
>  the condition has length > 1 and only the first element will be used
> 2: In if (is.na(x)) x <- 0 else x <- x :
>  the condition has length > 1 and only the first element will be used
> 3: In if (is.na(x)) x <- 0 else x <- x :
>  the condition has length > 1 and only the first element will be used
> > x
>  Q20 Q21 Q22 Q23 Q24 mySum myCount
> 1   0   1    2   3   4     6       3
> 2   1  NA   2   3   4     7       2
> 3   2   1    2   3   4     8       3
> > x <- transform( yy,
> +   mySum = NifAvail(Q20) + NifAvail(Q21) + NifAvail(Q24),      ################ Example 5
> +   myCount = as.numeric(!is.na(Q20)) + as.numeric(!is.na(Q21)) + as.numeric(!is.na(Q24))
> + )
> Warning messages:
> 1: In if (is.na(x)) x <- 0 else x <- x :
>  the condition has length > 1 and only the first element will be used
> 2: In if (is.na(x)) x <- 0 else x <- x :
>  the condition has length > 1 and only the first element will be used
> 3: In if (is.na(x)) x <- 0 else x <- x :
>  the condition has length > 1 and only the first element will be used
> > x
>  Q20 Q21 Q22 Q23 Q24 mySum myCount
> 1   0   1    2   3   4     5       3
> 2   1  NA   2   3   4    NA       2
> 3   2   1    2   3   4     7       3
>
>
> > x <- transform( yy,     ############ Example 6
> +   mySum = sum(as.numeric(Q20), as.numeric(Q21), as.numeric(Q23), na.rm=T),
> +   myCount = as.numeric(!is.na(Q20)) + as.numeric(!is.na(Q21)) + as.numeric(!is.na(Q24))
> + )
> + x
>  Q20 Q21 Q22 Q23 Q24 mySum myCount
> 1   0   1    2   3   4    14       3
> 2   1  NA   2   3   4    14       2
> 3   2   1    2   3   4    14       3
>
> > x <- transform( yy,     ############# Example 7
> +   mySum = sum(as.numeric(Q20), as.numeric(Q22), as.numeric(Q23), na.rm=T),
> +   myCount = as.numeric(!is.na(Q20)) + as.numeric(!is.na(Q21)) + as.numeric(!is.na(Q24))
> + )
> + x
>  Q20 Q21 Q22 Q23 Q24 mySum myCount
> 1   0   1    2   3   4    18       3
> 2   1  NA   2   3   4    18       2
> 3   2   1    2   3   4    18       3
>

David Winsemius, MD
West Hartford, CT


