[R] NAs and row/column calculations

David Winsemius dwinsemius at comcast.net
Fri Mar 12 05:44:36 CET 2010


On Mar 11, 2010, at 11:28 PM, David Winsemius wrote:

>
> On Mar 11, 2010, at 6:20 PM, Jim Bouldin wrote:
>
>>
>>>
>>> On 12/03/2010, at 11:25 AM, Jim Bouldin wrote:
>>>
>>>>
>>>> I continue to have great frustrations with NA values--in particular
>>>> making summary calculations on rows or cols of a matrix containing
>>>> them.  For example, why does:
>>>>
>>>>> a = matrix(1:30,nrow=5)
>>>>> is.na(a[c(1:2),c(3:4)]) = TRUE; a
>>>>   [,1] [,2] [,3] [,4] [,5] [,6]
>>>> [1,]    1    6   NA   NA   21   26
>>>> [2,]    2    7   NA   NA   22   27
>>>> [3,]    3    8   13   18   23   28
>>>> [4,]    4    9   14   19   24   29
>>>> [5,]    5   10   15   20   25   30
>>>>> apply(a[!is.na(a)],2,sum)
>>>>
>>>> give me this:
>>>>
>>>> "Error in apply(a[!is.na(a)], 2, sum) : dim(X) must have a positive
>>>> length"
>>>>
>>>> when
>>>>
>>>>> dim(a)
>>>> [1] 5 6
>>>>
>>>> What is the trick to calculating summary values from rows or columns
>>>> containing NAs?  Drives me nuts.  More nuts that is.
>>>
>>> When you do a[!is.na(a)] you get a ***vector*** --- not a matrix.
>>> ``Obviously''!!!
>>
>> Well, obvious to you maybe, or someone who's done it before, but not
>> to me.
>>
>>> The non-missing values of a cannot be arranged in
>>> a 5 x 6 matrix; there are only 26 of them.  So (as my late Uncle
>>> Stanley would have said) ``What the hell do you expect?''.
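
For anyone who wants to see the point above for themselves, here is a
minimal sketch. It assumes the same 5 x 6 matrix as in the original post,
with the NAs assigned directly via a[1:2, 3:4] <- NA (my shorthand, not
the poster's code). The names v and err are only for illustration.

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA            # same NA pattern as the matrix in the post

v <- a[!is.na(a)]            # indexing by a logical matrix drops dim
dim(v)                       # NULL: v is a plain vector
length(v)                    # 26 non-missing values, not 5 * 6 = 30

# apply() needs an object with dimensions, hence the error in the post.
# tryCatch() just captures the message instead of stopping the script.
err <- tryCatch(apply(v, 2, sum), error = function(e) conditionMessage(e))
err
```

So dim(a) is still 5 6, but the subset handed to apply() no longer has
any dim at all.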
>>
>> Silly me, I expected, based on (1) previous experience doing summary
>> calcs on subsets of a matrix using exactly that style of command, and
>> (2) the fact that dim(a) returns: [1] 5 6, and (3) the fact that a help
>> search under the "apply" function gives NO INDICATION of any possible
>> use of the na.rm command,
>
> Not really true. You may be at a stage where you are not yet paying
> attention to what the "..." arguments to functions are doing, so you
> may have passed over the fact that on the apply help page it is
> described as "optional arguments to FUN." In fairness to the authors
> of that page, it would be impossible to list all of the possible
> optional arguments, because the range of possible functions is, while
> countable, still extremely large. I think it would be useful for that
> help page to say a bit more about what restrictions may exist here and
> to include an example that uses that facility, but I am not part of R
> Core.
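
To illustrate what "optional arguments to FUN" means in practice, here is
a small sketch. The helper count_above and its cutoff argument are
hypothetical, made up for this example; only apply() and sum() are real
base R. The matrix is built as in the original post, with the NAs
assigned directly.

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

# Hypothetical helper (not from the thread): counts values above `cutoff`
count_above <- function(x, cutoff, na.rm = FALSE) {
  sum(x > cutoff, na.rm = na.rm)
}

# apply(X, MARGIN, FUN, ...): cutoff and na.rm are not arguments of apply()
# itself, so they travel through `...` to count_above() for each column
res <- apply(a, 2, count_above, cutoff = 10, na.rm = TRUE)
res
```

Anything apply() does not recognise as its own argument is handed on to
FUN unchanged, which is why na.rm never appears on the apply help page.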
>
>
>> AND (4) a help search on "na.action" does not even mention
>> na.rm, that:
>>
>>> apply(a[!is.na(a)],2,sum)
>>
>> would sum the non-NA elements of matrix a, by columns.  Terribly
>> faulty reasoning on my part, obviously.
>
> What, may I inquire, happens when you look at the help page for
> "sum"? While you are at it, you may want to acquaint yourself with
> the "na.rm=" parameter in other functions, because it is also
> essential for productive use of several other useful functions, like
> median and density.
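
A quick sketch of na.rm= in the three functions just named. The vector x
is made-up example data, not anything from the thread.

```r
x <- c(2, 4, NA, 8)          # made-up data with one missing value

s_default <- sum(x)                    # NA: one missing value poisons the sum
s_clean   <- sum(x, na.rm = TRUE)      # 14
m_clean   <- median(x, na.rm = TRUE)   # 4

# density() refuses input with NAs unless na.rm = TRUE is supplied
d <- density(x, na.rm = TRUE)
class(d)
```

In each case the default is na.rm = FALSE, so the missing values have to
be dropped explicitly.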

As a further exercise you may want to follow this path. (I learned new
bits.)  After getting annoyed that neither ?"na.rm" nor ??"na.rm"
provided any 'help', I tried the sos package:

 > ??"na.rm"
No help files found matching ‘na.rm’ using regexp matching
 > library(sos)
Loading required package: brew

Attaching package: 'sos'


	The following object(s) are masked from package:utils :

	 ?

 > ???"na.rm"
found 476 matches;  retrieving 20 pages, 400 matches.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
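
If installing sos is not an option, a rough alternative is to ask R which
functions in the base package list na.rm among their arguments. This is
only a sketch: has_na_rm and na_rm_fns are names I made up, and it only
looks at the base package.

```r
# TRUE if function `f` lists na.rm among its formal arguments.
# args() is used because formals() alone returns NULL for primitives
# such as sum.
has_na_rm <- function(f) {
  if (!is.function(f)) return(FALSE)
  sig <- args(f)               # NULL for a few primitives such as `[`
  !is.null(sig) && "na.rm" %in% names(formals(sig))
}

base_objs <- ls("package:base")
na_rm_fns <- Filter(function(nm) has_na_rm(get(nm, envir = baseenv())),
                    base_objs)
head(na_rm_fns)
```

One caveat: generics such as mean take na.rm in their methods
(mean.default), not in the generic itself, so mean will not show up in
this list even though mean(x, na.rm = TRUE) works.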

>
>>
>>
>>>
>>> The ``trick'' is to remove the NAs at the summing stage:
>>>
>>> apply(a,2,sum,na.rm=TRUE)
>>>
>>> Not all that tricky.
>>>
>>> 	cheers,
>>>
>>> 		Rolf Turner
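
For column and row sums in particular, base R also has colSums() and
rowSums(), which accept na.rm directly. A minimal sketch comparing them
with the apply() form above, again assuming the matrix from the original
post with NAs assigned directly (the by_* names are mine):

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

by_apply   <- apply(a, 2, sum, na.rm = TRUE)   # the form suggested above
by_colSums <- colSums(a, na.rm = TRUE)         # dedicated base function
by_rows    <- rowSums(a, na.rm = TRUE)         # same idea across rows

all.equal(as.numeric(by_apply), by_colSums)
```

The apply() form generalises to any summary function; colSums() and
rowSums() only cover sums but are usually the more direct choice for
that case.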
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT


