[R] strange answer when using 'aggregate()' with a formula

Chel Hee Lee chl948 at mail.usask.ca
Thu Jan 21 16:06:57 CET 2016


I appreciate your kind guidance!  I did not read the manual carefully 
(it's my fault).

Thank you so much, Prof. John Fox!

Chel Hee Lee

On 01/21/2016 12:52 AM, Fox, John wrote:
> Dear Chel Hee Lee,
>
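> With the formula method, the default na.action is na.omit, so rows with NA
> are dropped before the function is applied.  Use na.action=na.pass to keep
> them; thus,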
>
>> aggregate(y~grp, data=tmp, function(x) sum(is.na(x)), na.action=na.pass)
>    grp y
> 1   2 1
> 2   3 0
>
> I hope this helps,
>   John
>
> -----------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Chel Hee Lee
>> Sent: January 21, 2016 5:08 AM
>> To: R-help at r-project.org
>> Subject: [R] strange answer when using 'aggregate()' with a formula
>>
>> Could you kindly test the following code?  I ask because I got a strange answer
>> when 'aggregate()' is used with a formula.
>>
>> I am trying to count how many missing data entries are in each group.
>> For this exercise, I created data as below:
>>
>>   > tmp <- data.frame(grp=c(2,3,2,3), y=c(NA, 0.5, 3, 0.5))
>>   > tmp
>>     grp   y
>> 1   2  NA
>> 2   3 0.5
>> 3   2 3.0
>> 4   3 0.5
>>
>> The observations (variable y) fall into two groups (variable grp).  For
>> group 2, y has NA and 3.0.  For group 3, y has 0.5 and 0.5.  Hence, the
>> number of missing values is 1 and 0 for groups 2 and 3, respectively.
>> This count can be obtained using 'aggregate()' in the 'stats' package
>> as below:
>>
>>   > aggregate(x=tmp$y, by=list(grp=tmp$grp), function(x) sum(is.na(x)))
>>     grp x
>> 1   2 1
>> 2   3 0
>>
>> A formula can be used as below:
>>
>>   > aggregate(y~grp, data=tmp, function(x) sum(is.na(x)))
>>     grp y
>> 1   2 0
>> 2   3 0
>>
>> What a surprise!  Is this a bug?  I would appreciate it if you could share
>> the results after testing the code.  Thank you so much for your help in
>> advance!
>>
>> Chel Hee Lee
>>
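
For anyone who wants to reproduce the thread, here is a minimal sketch that puts the two formula calls side by side, using the data from the original post. The helper name count_na and the result names (res_default, res_pass, res_tapply) are only illustrative and do not appear in the thread; the tapply() line is an extra cross-check that was not part of the original exchange.

```r
# Data from the original post
tmp <- data.frame(grp = c(2, 3, 2, 3), y = c(NA, 0.5, 3, 0.5))

# Illustrative helper (name is an assumption): count missing values in a vector
count_na <- function(x) sum(is.na(x))

# Formula method with the default na.action = na.omit:
# the row with y = NA is removed before count_na() ever sees it,
# so both groups report 0 missing values.
res_default <- aggregate(y ~ grp, data = tmp, FUN = count_na)

# Formula method with na.action = na.pass:
# NA rows are kept, so group 2 reports 1 and group 3 reports 0.
res_pass <- aggregate(y ~ grp, data = tmp, FUN = count_na, na.action = na.pass)

# Cross-check without the formula interface (not from the thread)
res_tapply <- tapply(tmp$y, tmp$grp, count_na)

print(res_default)
print(res_pass)
print(res_tapply)
```

The point of the comparison is that the formula interface filters the data before FUN is called, so any FUN that is meant to inspect NA values needs na.action = na.pass.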


