[R] Aggregate function

Monica Pisica pisicandru at hotmail.com
Fri Feb 13 14:51:12 CET 2009

```Hi again,

Thanks a lot for all the suggestions. It will take me a little while to wrap my head around what is what, though! This will help me quite a bit.

One difference in the output between your solution and Mark's solution is this. Your solution returns one row per tied species:
     loc sp tot
L1    L1  b  60
L2.5  L2  d  25
L2.7  L2  e  25
L3    L3  b  68

And Mark's solution:
   loc  sp tot
L1  L1   b  60
L2  L2 d,e  25
L3  L3   b  68

I will probably use both types of solution, depending on what else I need to do with the data.
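
# A minimal sketch, assuming the tied tot vector quoted further down, that
# rebuilds the data and reproduces both outputs. The names per_species,
# per_location and via_merge are only illustrative. The merge() variant at
# the end is not from this thread; it is one possible way to keep ties
# while still using aggregate.
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 15, 25, 20, 68, 32)
xveg <- data.frame(loc, sp, tot)

# One row per species that reaches the maximum total in its location
per_species <- do.call(rbind, by(xveg, xveg["loc"],
                                 function(x) x[x$tot == max(x$tot), ]))
print(per_species)

# One row per location, with tied species pasted into a single string
per_location <- do.call(rbind, lapply(split(xveg, xveg$loc), function(d) {
  data.frame(loc = d$loc[1],
             sp = paste(d$sp[d$tot == max(d$tot)], collapse = ","),
             tot = max(d$tot))
}))
print(per_location)

# Assumption: aggregate computes the per-location maximum (a scalar), and
# merge() then keeps every original row that matches it on loc and tot
via_merge <- merge(xveg, aggregate(tot ~ loc, data = xveg, FUN = max))
print(via_merge)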

Thank you all for your help,

Monica

----------------------------------------
> Date: Thu, 12 Feb 2009 14:05:44 -0800
> From: spector at stat.berkeley.edu
> To: pisicandru at hotmail.com
> CC: christos.hatzis at nuverabio.com; r-help at r-project.org; markleeds at verizon.net
> Subject: Re: [R] Aggregrate function
>
> Monica -
> Here's a more compact version of the same idea:
>
> do.call(rbind, by(xveg, xveg['loc'], function(x) x[x$tot == max(x$tot), ]))
>
> - Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spector at stat.berkeley.edu
>
>
> On Thu, 12 Feb 2009, Monica Pisica wrote:
>
>>
>> Hi,
>>
>> Thanks for the solution. Mark Leeds sent me privately a very similar solution. My next question to him was:
>>
>> Suppose that for a certain location 2 species have the same maximum total ... (there are ties in the data for a particular location). How do i get all species that have that max. total??
>>
>> For this case i have changed the tot as follows:
>>
>> tot <- c(20, 60, 40, 15, 25, 15, 25, 20, 68, 32)
>>
>> His solution is (and it does work):
>>
>> temp <- lapply(split(xveg, loc), function(.df) {
>>   maxindices <- which(.df$tot == .df$tot[which.max(.df$tot)])
>>   data.frame(loc=.df$loc[1], sp=paste(.df$sp[maxindices], collapse=","), tot=max(.df$tot))
>> })
>>
>> result <- do.call(rbind,temp)
>> print(result)
>>
>> Thanks so much again,
>>
>> Monica
>>
>>
>>
>>> From: christos.hatzis at nuverabio.com
>>> To: pisicandru at hotmail.com; r-help at r-project.org
>>> Subject: RE: [R] Aggregrate function
>>> Date: Thu, 12 Feb 2009 15:56:38 -0500
>>>
>>> I don't have an easy solution with aggregate, because the function in
>>> aggregate needs to return a scalar.
>>> But the following should work:
>>>
>>> do.call("rbind", lapply(split(xveg, xveg$loc), function(x)
>>> x[which.max(x$tot), ]))
>>>
>>>    loc sp tot
>>> L1  L1  b  60
>>> L2  L2  e  30
>>> L3  L3  b  68
>>>
>>> -Christos
>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org
>>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Monica Pisica
>>>> Sent: Thursday, February 12, 2009 1:58 PM
>>>> To: R help project
>>>> Subject: [R] Aggregrate function
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I have to admit that I don't fully understand the
>>>> aggregate function, but i think it should help me with what i
>>>> want to do.
>>>>
>>>> xveg is a data.frame with location, species, and total for
>>>> the species. Each location is repeated, once for every
>>>> species present at that location. For each location i want to
>>>> find out which species has the maximum total ... so i've
>>>> tried different ways to do it using aggregate.
>>>>
>>>> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
>>>> sp <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
>>>> tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
>>>> xveg <- data.frame(loc, sp, tot)
>>>>
>>>> result desired:
>>>>
>>>> L1 b
>>>> L2 e
>>>> L3 b
>>>>
>>>> sp_maj <- aggregate(xveg[,2], list(xveg[,1]), function(x)
>>>> levels(x)[which.max(table(x))])
>>>>
>>>> This is wrong because it gives the first species name in each
>>>> level of location, so i get a, a, b, as species instead of b, e, b.
>>>>
>>>> I've tried a few other aggregate commands, all with wrong results.
>>>>
>>>> I will appreciate any help,
>>>>
>>>> Thanks,
>>>>
>>>> Monica
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>>>