[R] Aggregrate function

David Winsemius dwinsemius at comcast.net
Fri Feb 13 06:13:08 CET 2009


I realized later (and was also reminded privately) that the which()
call is not necessary. The %in% operator returns a logical vector,
which works just as well for matrix or data frame indexing as the
numeric vector returned by which().
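
A minimal sketch of that point, using the example data from the original
question quoted further down. The variable names (max_by_loc, is_max,
is_group_max and so on) are made up for illustration. The last step, which
compares each total against the maximum of its own location via ave(), is
not from the thread; it is only one possible way to avoid matching a total
in one location against the maximum of a different location.

```r
# Example data from the original question in this thread
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
xveg <- data.frame(loc, sp, tot)

# Maximum total per location (a named vector with one entry per location)
max_by_loc <- with(xveg, tapply(tot, loc, max))

# %in% already yields a logical vector, usable directly as a row index
is_max <- xveg$tot %in% max_by_loc
sp_logical <- xveg[is_max, "sp"]

# The same selection with the extra which() step
sp_which <- xveg[which(is_max), "sp"]
same <- identical(sp_logical, sp_which)

# Hypothetical group-wise variant (an assumption, not from the thread):
# compare each total only against the maximum of its own location
is_group_max <- xveg$tot == ave(xveg$tot, xveg$loc, FUN = max)
sp_groupwise <- xveg[is_group_max, c("loc", "sp")]
```

With this particular data set both approaches pick the same rows. They can
differ when a non-maximal total in one location happens to equal the
maximum of another location, so it is worth checking against your own data.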

-- 
David Winsemius


On Feb 12, 2009, at 5:52 PM, David Winsemius wrote:

> aggregate and by are convenience wrappers around tapply.
>
> Consider this alternate solution:
>
> xveg[which(xveg$tot %in% with(xveg, tapply(tot, loc, max))),"sp"]
>
> It uses tapply to find the maxima by loc(ation) and then goes
> back into xveg to find the corresponding sp(ecies).  You should
> test whether the handling of ties agrees with your needs.
>
> --
> David Winsemius
>
> On Feb 12, 2:56 pm, "Christos Hatzis" <christos.hat... at nuverabio.com>
> wrote:
>> I don't have an easy solution with aggregate, because the function in
>> aggregate needs to return a scalar.
>> But the following should work:
>>
>> do.call("rbind", lapply(split(xveg, xveg$loc), function(x)
>> x[which.max(x$tot), ]))
>>
>>    loc sp tot
>> L1  L1  b  60
>> L2  L2  e  30
>> L3  L3  b  68
>>
>> -Christos
>>
>>
>>
>>> -----Original Message-----
>>> From: r-help-boun... at r-project.org
>>> [mailto:r-help-boun... at r-project.org] On Behalf Of Monica Pisica
>>> Sent: Thursday, February 12, 2009 1:58 PM
>>> To: R help project
>>> Subject: [R] Aggregrate function
>>
>>> Hi,
>>
>>> I have to admit that I don't fully understand the
>>> aggregate function, but I think it should help me with what I
>>> want to do.
>>
>>> xveg is a data.frame with location, species, and total for
>>> the species. Each location is repeated, once for every
>>> species present at that location. For each location I want to
>>> find out which species has the maximum total, so I've
>>> tried different ways to do it using aggregate.
>>
>>> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
>>> sp <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
>>> tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
>>> xveg <- data.frame(loc, sp, tot)
>>
>>> result desired:
>>
>>> L1   b
>>> L2   e
>>> L3   b
>>
>>> sp_maj <- aggregate(xveg[,2], list(xveg[,1]), function(x)
>>> levels(x)[which.max(table(x))])
>>
>>> This is wrong because it gives the first species name in each
>>> level of location, so I get a, a, b as species instead of b, e, b.
>>
>>> I've tried a few other aggregate commands, all with wrong results.
>>
>>> I would appreciate any help,
>>
>>> Thanks,
>>
>>> Monica
>>
>>> ______________________________________________
>>> R-h... at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>



