[R] NAs and row/column calculations

David Winsemius dwinsemius at comcast.net
Fri Mar 12 05:28:11 CET 2010


On Mar 11, 2010, at 6:20 PM, Jim Bouldin wrote:

>
>>
>> On 12/03/2010, at 11:25 AM, Jim Bouldin wrote:
>>
>>>
>>> I continue to have great frustrations with NA values--in particular
>>> making summary calculations on rows or cols of a matrix containing
>>> them. For example, why does:
>>>
>>>> a = matrix(1:30,nrow=5)
>>>> is.na(a[c(1:2),c(3:4)]) <- TRUE; a
>>>    [,1] [,2] [,3] [,4] [,5] [,6]
>>> [1,]    1    6   NA   NA   21   26
>>> [2,]    2    7   NA   NA   22   27
>>> [3,]    3    8   13   18   23   28
>>> [4,]    4    9   14   19   24   29
>>> [5,]    5   10   15   20   25   30
>>>> apply(a[!is.na(a)],2,sum)
>>>
>>> give me this:
>>>
>>> "Error in apply(a[!is.na(a)], 2, sum) : dim(X) must have a positive length"
>>>
>>> when
>>>
>>>> dim(a)
>>> [1] 5 6
>>>
>>> What is the trick to calculating summary values from rows or columns
>>> containing NAs?  Drives me nuts.  More nuts that is.
>>
>> When you do a[!is.na(a)] you get a ***vector*** --- not a matrix.
>> ``Obviously''!!!
>
> Well, obvious to you maybe, or someone who's done it before, but not
> to me.
>
>> The non-missing values of a cannot be arranged in
>> a 5 x 6 matrix; there are only 26 of them.  So (as my late Uncle
>> Stanley would have said) ``What the hell do you expect?''.
>
> Silly me, I expected, based on (1) previous experience doing summary
> calcs on subsets of a matrix using exactly that style of command, and
> (2) the fact that dim(a) returns: [1] 5 6, and (3) the fact that a help
> search under the "apply" function gives NO INDICATION of any possible
> use of the na.rm command,

Not really true. You may be at a stage where you are not paying
attention to what the "..." arguments to functions are doing, so you
may have passed over the fact that the apply help page describes "..."
as "optional arguments to FUN." In fairness to the authors of that
page, it would be impossible to list every possible optional argument,
because the range of possible functions is, while countable, still
extremely large. I think it would be useful for that help page to say
a bit more about what restrictions may exist here and to include an
example that uses that facility, but I am not part of R Core.
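
As a small illustration of that "..." mechanism, here is a sketch using the matrix from the original post (it is not taken from the help page, and the variable names are only illustrative). Any extra named argument given to apply is handed on to FUN, so na.rm=TRUE ends up as an argument to sum:

```r
# Rebuild the example matrix and set the same four cells to NA
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

# na.rm = TRUE is not an argument of apply(); it travels through "..." to sum()
col_sums <- apply(a, 2, sum, na.rm = TRUE)

# Equivalent call with the forwarding written out by hand
col_sums_explicit <- apply(a, 2, function(x) sum(x, na.rm = TRUE))

print(col_sums)
```

Both forms should give the same column sums; the first is just shorter.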


> AND (4) a help search on "na.action" does not even mention
> na.rm, that:
>
>> apply(a[!is.na(a)],2,sum)
>
> would sum the non-NA elements of matrix a, by columns.  Terribly
> faulty reasoning on my part, obviously.

What, may I inquire, happens when you look at the help page for "sum"?
While you are at it, you may want to acquaint yourself with the
"na.rm=" parameter in other functions, because it is also essential
for productive use of several other useful functions, like median and
density.
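
A brief sketch of what na.rm= does, again assuming the matrix from the original post (the variable names are mine). Without it, a single NA makes the whole result NA; with it, the missing values are dropped before the calculation:

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

x <- a[, 3]                           # NA NA 13 14 15

sum_default <- sum(x)                 # NA, because x contains missing values
sum_clean   <- sum(x, na.rm = TRUE)   # NAs dropped before summing
med_clean   <- median(x, na.rm = TRUE)

# colSums() and rowSums() accept na.rm as well
col_totals <- colSums(a, na.rm = TRUE)
row_totals <- rowSums(a, na.rm = TRUE)
```

For plain sums over rows or columns, colSums and rowSums are probably the more direct choice than apply.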

>
>
>>
>> The ``trick'' is to remove the NAs at the summing stage:
>>
>> apply(a,2,sum,na.rm=TRUE)
>>
>> Not all that tricky.
>>
>> 	cheers,
>>
>> 		Rolf Turner
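
To see why the original call fails, a short sketch (same example matrix; "kept" is just an illustrative name) that compares the dimensions before and after the logical subsetting:

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

kept <- a[!is.na(a)]   # indexing by a logical matrix returns a plain vector

dim(a)        # 5 6
dim(kept)     # NULL, which is what apply() complains about
length(kept)  # 26
```

So dim(a) is still 5 6, but the object actually handed to apply has no dim attribute at all.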

David Winsemius, MD
West Hartford, CT


