[R] Aggregate function

Phil Spector spector at stat.berkeley.edu
Thu Feb 12 23:05:44 CET 2009


Monica -
    Here's a more compact version of the same idea:

   do.call(rbind, by(xveg, xveg['loc'], function(x) x[x$tot == max(x$tot), ]))

                                        - Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu
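
For readers who want to try the one-liner above on the tied data from Monica's follow-up, here is a minimal self-contained sketch. The variable name `ties` and the explicit `stringsAsFactors = FALSE` are assumptions added for illustration; they are not part of the original posts.

```r
# Data from the thread, using the tied totals from the follow-up question
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 15, 25, 20, 68, 32)
xveg <- data.frame(loc, sp, tot, stringsAsFactors = FALSE)

# For each location, keep every row whose total equals that location's
# maximum, so tied species are all retained
ties <- do.call(rbind, by(xveg, xveg["loc"], function(x) x[x$tot == max(x$tot), ]))
print(ties)
```

Because the comparison is `x$tot == max(x$tot)` rather than `which.max`, location L2 should contribute two rows (species d and e).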


On Thu, 12 Feb 2009, Monica Pisica wrote:

>
> Hi,
>
> Thanks for the solution. Mark Leeds privately sent me a very similar one. My next question to him was:
>
> Suppose that for a certain location two species have the same maximum total (that is, there are ties in the data for that location). How do I get all species that have that maximum total?
>
> For this case I have changed tot as follows:
>
> tot <-  c(20, 60, 40, 15, 25, 15, 25, 20, 68, 32)
>
> His solution is (and it does work):
>
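> # split on the data frame's own loc column rather than the global loc vector
> temp <- lapply(split(xveg, xveg$loc), function(.df) {
>  # rows whose total equals the maximum for this location (keeps ties)
>  maxindices <- which(.df$tot == max(.df$tot))
>  data.frame(loc=.df$loc[1], sp=paste(.df$sp[maxindices], collapse=","), tot=max(.df$tot))
> })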
>
> result <- do.call(rbind,temp)
> print(result)
>
> Thanks so much again,
>
> Monica
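
As a possible alternative to the split/lapply solution quoted above, the same tied result can be sketched with `ave()`, which is a different technique from the one Mark Leeds posted. The names `grp_max`, `winners` and `collapsed` are hypothetical, chosen only for this example.

```r
# Data from the thread, using the tied totals
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 15, 25, 20, 68, 32)
xveg <- data.frame(loc, sp, tot, stringsAsFactors = FALSE)

# ave() returns, for every row, the maximum total within that row's location
grp_max <- ave(xveg$tot, xveg$loc, FUN = max)

# Keep the rows that reach their location's maximum (ties included)
winners <- xveg[xveg$tot == grp_max, ]

# Collapse tied species into one comma-separated string per location
collapsed <- aggregate(sp ~ loc, data = winners,
                       FUN = function(s) paste(s, collapse = ","))
print(collapsed)
```

The intermediate `winners` data frame keeps one row per tied species, which may be more convenient than the collapsed string if the result feeds further computation.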
>
>
>
>> From: christos.hatzis at nuverabio.com
>> To: pisicandru at hotmail.com; r-help at r-project.org
>> Subject: RE: [R] Aggregrate function
>> Date: Thu, 12 Feb 2009 15:56:38 -0500
>>
>> I don't have an easy solution with aggregate, because the function in
>> aggregate needs to return a scalar.
>> But the following should work:
>>
>> do.call("rbind", lapply(split(xveg, xveg$loc), function(x)
>> x[which.max(x$tot), ]))
>>
>>    loc sp tot
>> L1  L1  b  60
>> L2  L2  e  30
>> L3  L3  b  68
>>
>> -Christos
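
The point about aggregate needing a scalar suggests a two-step route that may be worth sketching: let aggregate compute the per-location maximum, then merge that back onto the data to recover the species. This is only an illustration and not something posted in the thread; `mx` and `res` are hypothetical names.

```r
# Original data from the thread (no ties)
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
xveg <- data.frame(loc, sp, tot, stringsAsFactors = FALSE)

# Step 1: aggregate() returns one scalar per group, here the maximum total
mx <- aggregate(tot ~ loc, data = xveg, FUN = max)

# Step 2: merging on both loc and tot recovers the species with that total
res <- merge(xveg, mx, by = c("loc", "tot"))
res <- res[order(res$loc), c("loc", "sp", "tot")]
print(res)
```

If a location has tied maxima, the merge returns one row for each tied species.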
>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org
>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Monica Pisica
>>> Sent: Thursday, February 12, 2009 1:58 PM
>>> To: R help project
>>> Subject: [R] Aggregrate function
>>>
>>>
>>> Hi,
>>>
>>> I have to admit that I don't fully understand the
>>> aggregate function, but I think it should help me with what I
>>> want to do.
>>>
>>> xveg is a data.frame with location, species, and total for
>>> the species. Each location is repeated, once for every
>>> species present at that location. For each location I want to
>>> find out which species has the maximum total, so I've
>>> tried different ways to do it using aggregate.
>>>
>>> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
>>> sp <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
>>> tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
>>> xveg <- data.frame(loc, sp, tot)
>>>
>>> result desired:
>>>
>>> L1 b
>>> L2 e
>>> L3 b
>>>
>>> sp_maj <- aggregate(xveg[,2], list(xveg[,1]), function(x)
>>> levels(x)[which.max(table(x))])
>>>
>>> This is wrong because it gives the first species name in each
>>> level of location, so I get a, a, b as species instead of b, e, b.
>>>
>>> I've tried a few other aggregate commands, all with wrong results.
>>>
>>> I would appreciate any help,
>>>
>>> Thanks,
>>>
>>> Monica
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>