[R] Summary information by groups programming assistance

Søren Højsgaard Soren.Hojsgaard at agrsci.dk
Mon Dec 22 23:25:46 CET 2008


summaryBy (or lapplyBy/splitBy) in the doBy package might help you.
Regards
Søren
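
A minimal sketch of the kind of grouped summary asked about below, written in base R so that it runs without extra packages. The data frame `smb_demo` is invented toy data that only mimics the column names in the question, and `max_by_group` is a hypothetical helper name, not something from doBy. The idea is to split the data by Lake and psd, keep the row with the largest vol in each group via which.max, and stack the results. With doBy installed, a call such as summaryBy(vol ~ Lake + psd, data = smb_demo, FUN = max) should give the per-group maxima, but not the Length that goes with each one.

```r
# Toy data mimicking the columns in the question (values are made up).
smb_demo <- data.frame(
  Lake   = c("Clear", "Clear", "Clear", "Pickerel", "Pickerel"),
  psd    = c("substock", "substock", "S-Q", "substock", "substock"),
  Length = c(150, 152, 180, 170, 165),
  vol    = c(0.0105, 3.2655, 0.0210, 3.4965, 0.5000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each Lake x psd combination, keep the row
# holding the largest vol (the first such row if there are ties).
max_by_group <- function(d) {
  # drop = TRUE discards Lake/psd combinations that have no rows
  groups <- split(d, list(d$Lake, d$psd), drop = TRUE)
  rows <- lapply(groups, function(g) {
    g[which.max(g$vol), c("Lake", "psd", "Length", "vol")]
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out[order(out$Lake, out$psd), ]
}

result <- max_by_group(smb_demo)
print(result)
```

The result is an ordinary data frame with one row per Lake/psd group, so it could be passed as the data argument to nls or reused for other data sets that have the same column names.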

________________________________

From: r-help-bounces at r-project.org on behalf of Ranney, Steven
Sent: Mon 22-12-2008 22:51
To: r-help at r-project.org
Subject: [R] Summary information by groups programming assistance



All -

I have data that looks like

         psd Species  Lake Length Weight St.weight        Wr      Wr.1    vol
432 substock     SMB Clear    150  41.00      0.01  95.12438  95.10118 0.0105
433 substock     SMB Clear    152  39.00      0.01  86.72916  86.70692 0.0105
434 substock     SMB Clear    152  40.00      3.11  88.95298  82.03689 3.2655
435 substock     SMB Clear    159  48.00      0.04  92.42095  92.34393 0.0420
436 substock     SMB Clear    159  48.00      0.01  92.42095  92.40170 0.0105
437 substock     SMB Clear    165  47.00      0.03  80.38023  80.32892 0.0315
438 substock     SMB Clear    171  62.00      0.21  94.58105  94.26070 0.2205
439 substock     SMB Clear    178  70.00      0.01  93.91912  93.90571 0.0105
440 substock     SMB Clear    179  76.00      1.38 100.15760  98.33895 1.4490
441      S-Q     SMB Clear    180  75.00      0.01  97.09330  97.08035 0.0105
442      S-Q     SMB Clear    180  92.00      0.02 119.10111 119.07522 0.0210
...
[truncated]

where psd and Lake are categorical variables with five and four
categories, respectively.  I'd like to find the maximum vol, and the
Length at which that maximum occurs, for each psd category within each
lake.  In other words, I'd like to have a data frame that looks
something like

Lake      Category  Length  vol
Clear     substock  152     3.2655
Clear     S-Q       266     11.73
Clear     Q-P       330     14.89
...
Pickerel  substock  170     3.4965
Pickerel  S-Q       248     10.69
Pickerel  Q-P       335     25.62
Pickerel  P-M       415     32.62
Pickerel  M-T       442     17.25


To get this originally, I used

with(smb[Lake=="Clear",], tapply(vol, list(Length, psd),max))
with(smb[Lake=="Enemy.Swim",], tapply(vol, list(Length, psd),max))
with(smb[Lake=="Pickerel",], tapply(vol, list(Length, psd),max))
with(smb[Lake=="Roy",], tapply(vol, list(Length, psd),max))

and pulled the values I needed out by hand and put them into a .csv.
Unfortunately, I've got a number of other data sets on which I'll need
to do the same analysis.  A programmatic alternative would be much
easier (and likely less error-prone) and give the same results.
Ideally, the "Length" and "vol" data would end up in a data frame that
I could then analyze with nls.

Does anyone have any thoughts as to how I might accomplish this? 

Thanks in advance,

Steven Ranney  
