[R] "apply" question

Gabor Grothendieck ggrothendieck at gmail.com
Mon May 2 17:19:59 CEST 2005


On 5/2/05, Christoph Scherber <Christoph.Scherber at uni-jena.de> wrote:
> Dear R users,
> 
> I've got a simple question but somehow I can't find the solution:
> 
> I have a data frame with columns 1-5 containing one set of integer
> values, and columns 6-10 containing another set of integer values.
> Columns 6-10 contain NA's at some places.
> 
> I now want to calculate
> (1) the number of values in each row of columns 6-10 that were NA's

Supposing our data is called DF,

rowSums(is.na(DF[,6:10]))
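
As a quick check, here is a small sketch with DF rebuilt by hand from the two example rows quoted further down (the names n_missing and n_present are only for illustration). is.na() counts the missing values; negating it with ! counts the non-missing ones instead.

```r
# DF is reconstructed from the two example rows in the question
DF <- data.frame(
  Col1 = c(1, 3), Col2 = c(2, 1), Col3 = c(5, 4), Col4 = c(2, 5), Col5 = c(3, 2),
  Col6 = c(NA, 6), Col7 = c(5, NA), Col8 = c(NA, 4), Col9 = c(1, NA), Col10 = c(4, 1)
)

# is.na() gives a logical matrix; rowSums() counts the TRUEs in each row
n_missing <- rowSums(is.na(DF[, 6:10]))   # NAs per row: 2 2
n_present <- rowSums(!is.na(DF[, 6:10]))  # non-NAs per row: 3 3
```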

> (2) the sum of all values on columns 1-5 for which there were no missing
> values in the corresponding cells of columns 6-10.

In the expression below, 1 + 0 * DF[,6:10] has the same shape as
DF[,6:10], but every non-NA is replaced by 1 and every NA stays NA.
Multiplying DF[,1:5] by that effectively replaces each element in
DF[,1:5] with an NA if the corresponding element of DF[,6:10] is NA,
and na.rm = TRUE then leaves those elements out of the row sums.

rowSums( DF[,1:5] * (1 + 0 * DF[,6:10]), na.rm = TRUE )
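
To see the masking step by step, here is a sketch on the two example rows from the question (DF is rebuilt by hand; mask, masked, x and alt are illustrative names, not part of the original expression). The last lines show an assumed alternative that sets the NAs explicitly rather than through arithmetic.

```r
# DF is reconstructed from the two example rows in the question
DF <- data.frame(
  Col1 = c(1, 3), Col2 = c(2, 1), Col3 = c(5, 4), Col4 = c(2, 5), Col5 = c(3, 2),
  Col6 = c(NA, 6), Col7 = c(5, NA), Col8 = c(NA, 4), Col9 = c(1, NA), Col10 = c(4, 1)
)

# 0 * NA is NA and 1 + NA is NA, so mask is 1 where Col6..Col10 has a value
mask <- 1 + 0 * DF[, 6:10]

# Element-wise product: values in Col1..Col5 become NA where the mask is NA
masked <- DF[, 1:5] * mask

# na.rm = TRUE drops the masked cells from each row sum
res <- rowSums(masked, na.rm = TRUE)  # 7 9

# Alternative without the arithmetic trick: assign the NAs directly
x <- as.matrix(DF[, 1:5])
x[is.na(as.matrix(DF[, 6:10]))] <- NA
alt <- rowSums(x, na.rm = TRUE)       # 7 9
```

The first row gives 2 + 2 + 3 = 7, matching the result expected in the question.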

> 
> Example: (let's call the data frame "data")
> 
> Col1   Col2   Col3   Col4   Col5   Col6   Col7   Col8   Col9   Col10
> 1      2      5      2      3      NA     5      NA     1      4
> 3      1      4      5      2      6      NA     4      NA     1
> 
> The result would then be (for the first row)
> (1) "There were 2 NA's in columns 6-10."
> (2) "The mean of Columns 1-5 was 2+2+3=7" (because there were NA's in the
> 1st and 3rd position in columns 6-10)

I guess you meant sum when you referred to mean in (2).  If you really
do want the mean, replace rowSums with rowMeans in the expression
given above in the answer to (2).
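
A sketch of the rowMeans variant on the two example rows from the question (DF rebuilt by hand, m is an illustrative name). Note that with na.rm = TRUE the mean is taken over the retained values only, so the first row is divided by 3, not by 5.

```r
# DF is reconstructed from the two example rows in the question
DF <- data.frame(
  Col1 = c(1, 3), Col2 = c(2, 1), Col3 = c(5, 4), Col4 = c(2, 5), Col5 = c(3, 2),
  Col6 = c(NA, 6), Col7 = c(5, NA), Col8 = c(NA, 4), Col9 = c(1, NA), Col10 = c(4, 1)
)

# Same masking as for the sum, but averaging the cells that survive
m <- rowMeans(DF[, 1:5] * (1 + 0 * DF[, 6:10]), na.rm = TRUE)
# first row: (2 + 2 + 3) / 3, second row: (3 + 4 + 2) / 3
```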



