[R] by function ??

Matthew Dowle mdowle at mdowle.plus.com
Mon Dec 21 11:42:37 CET 2009


or if Dataset is a data.table :

> Dataset = data.table(Dataset)
> Dataset[,abs(ratio-median(ratio)),by="LEAID"]
     LEAID        V1
[1,]  6307 0.0911905
[2,]  6307 0.0488095
[3,]  6307 0.0488095
[4,]  6307 0.1088095
[5,]  8300 0.2021538
[6,]  8300 0.0000000
[7,]  8300 0.0600000
rather than :
> Dataset$abs <- with(Dataset, ave(ratio, LEAID, FUN=function(x)abs(x-median(x))))

This is less code and more natural (to me anyway), e.g. it doesn't require 
use of function() or ave(). data.table knows that if the j expression 
returns a vector it should silently repeat the group values to match the 
length of the j result (which is what it is doing here). If the j expression 
returned a scalar you would get just 2 rows in this example, one per group. 
Note that the 'by' expression must evaluate to integer, or to a list of 
integer vectors, so in this case LEAID must either be integer already or be 
coerced to integer using by="as.integer(LEAID)".
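
For anyone trying this with a current data.table release, here is a rough 
sketch of the vector versus scalar behaviour described above. It assumes the 
data.table package is installed, rebuilds the 7 rows from the thread by hand, 
and uses today's syntax (by = LEAID), which may not match the 2009 version 
exactly. I believe later releases accept other types in 'by' as well, but the 
sketch keeps LEAID integer so it does not depend on that.

```r
library(data.table)

# The 7 observations from the thread, with LEAID stored as integer
Dataset <- data.table(
  LEAID = c(6307L, 6307L, 6307L, 6307L, 8300L, 8300L, 8300L),
  ratio = c(0.72, 0.7623810, 0.86, 0.92, 0.5678462, 0.77, 0.83)
)

# j returns a vector as long as the group: one output row per input row,
# with the group value repeated alongside each result
per_row <- Dataset[, abs(ratio - median(ratio)), by = LEAID]

# j returns a scalar: one output row per group
per_group <- Dataset[, median(ratio), by = LEAID]

print(per_row)
print(per_group)
```

The unnamed j result comes back in a column called V1 in both cases.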

To give the aggregate expression a name, just wrap it in the DT function. 
This is also how to return multiple aggregate expressions from each subset 
(some may return vectors, others may return scalars) by listing them inside 
DT() :

> Dataset[,DT(ratio,scaled=abs(ratio-median(ratio)),sum=sum(ratio)),by="LEAID"]
     LEAID     ratio    scaled      sum
[1,]  6307 0.7200000 0.0911905 3.262381
[2,]  6307 0.7623810 0.0488095 3.262381
[3,]  6307 0.8600000 0.0488095 3.262381
[4,]  6307 0.9200000 0.1088095 3.262381
[5,]  8300 0.5678462 0.2021538 2.167846
[6,]  8300 0.7700000 0.0000000 2.167846
[7,]  8300 0.8300000 0.0600000 2.167846
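
In current data.table releases the same named, multi-column result is 
normally written with list() in j rather than DT(). The sketch below assumes 
the package is installed and rebuilds the data by hand; it is meant as a 
present-day equivalent of the call above, not as the 2009 API.

```r
library(data.table)

# The 7 observations from the thread, with LEAID stored as integer
Dataset <- data.table(
  LEAID = c(6307L, 6307L, 6307L, 6307L, 8300L, 8300L, 8300L),
  ratio = c(0.72, 0.7623810, 0.86, 0.92, 0.5678462, 0.77, 0.83)
)

# Each list element becomes a named column. 'ratio' and 'scaled' are
# vectors of group length; 'sum' is a scalar and is repeated within the group.
res <- Dataset[, list(ratio  = ratio,
                      scaled = abs(ratio - median(ratio)),
                      sum    = sum(ratio)),
               by = LEAID]

print(res)
```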


"William Dunlap" <wdunlap at tibco.com> wrote in message 
news:77EB52C6DD32BA4D87471DCD70C8D7000243CBA1 at NA-PA-VBE03.na.tibco.com...
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of L.A.
> Sent: Saturday, December 12, 2009 12:39 PM
> To: r-help at r-project.org
> Subject: Re: [R] by function ??
>
>
>
> Thanks for all the help. They all worked, but I'm stuck again.
> I've tried searching, but I'm not sure how to word my search, as
> nothing came up.
> Here is my new hurdle: my data has 7 observations and my
> results have 2 answers:
>
>
> Here is my data
>
>      LEAID     ratio
> 3 6307     0.7200000
> 1 6307     0.7623810
> 2 6307     0.8600000
> 4 6307     0.9200000
> 5 8300     0.5678462
> 7 8300     0.7700000
> 6 8300     0.8300000
>
>
> > median<-summaryBy(ratio ~ LEAID, data = Dataset, FUN = median)
>
> > print(median)
>   LEAID       ratio.median
> 1 6307        0.8111905
> 2 8300        0.7700000
>
> Now what I want is a way to compute
> abs(ratio - median) by LEAID for each observation, to produce
> something like this:
>
>   LEAID     ratio         abs
> 3 6307     0.7200000     .0912
> 1 6307     0.7623810     .0488
> 2 6307     0.8600000     .0488
> 4 6307     0.9200000     .1088
> 5 8300     0.5678462     .2022
> 7 8300     0.7700000     .0000
> 6 8300     0.8300000     .0600

Try ave(), as in
   > Dataset$abs <- with(Dataset, ave(ratio, LEAID, FUN=function(x)abs(x-median(x))))
   > Dataset
     LEAID     ratio       abs
   3  6307 0.7200000 0.0911905
   1  6307 0.7623810 0.0488095
   2  6307 0.8600000 0.0488095
   4  6307 0.9200000 0.1088095
   5  8300 0.5678462 0.2021538
   7  8300 0.7700000 0.0000000
   6  8300 0.8300000 0.0600000
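
What ave() is doing in that call can be unpacked with split() and unsplit(). 
The following is only a base R sketch of the idea, with the data rebuilt by 
hand from the thread and the names via_ave, pieces and via_split made up for 
illustration:

```r
# The 7 observations from the thread
Dataset <- data.frame(
  LEAID = c(6307, 6307, 6307, 6307, 8300, 8300, 8300),
  ratio = c(0.72, 0.7623810, 0.86, 0.92, 0.5678462, 0.77, 0.83)
)

# ave() applies FUN within each LEAID group and returns a vector
# aligned with the original rows
via_ave <- with(Dataset, ave(ratio, LEAID, FUN = function(x) abs(x - median(x))))

# The same computation spelled out: split by group, transform each piece,
# then put the pieces back in the original row order
pieces    <- split(Dataset$ratio, Dataset$LEAID)
pieces    <- lapply(pieces, function(x) abs(x - median(x)))
via_split <- unsplit(pieces, Dataset$LEAID)

print(cbind(Dataset, via_ave, via_split))
```

Because the result has one value per row, it can be assigned straight back 
as a new column, which summaryBy()'s one-row-per-group output cannot.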

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> Thanks,
> L.A.
>
>
>
>
> Ista Zahn wrote:
> >
> > Hi,
> > I think you want
> >
> > by(TestData[ , "RATIO"], LEAID, median)
> >
> > -Ista
> >
> > On Tue, Dec 8, 2009 at 8:36 PM, L.A. <romsa at millect.com> wrote:
> >>
> >> I'm just learning and this is probably very simple, but I'm stuck.
> >> I'm trying to understand the by().
> >> This works.
> >> by(TestData, LEAID, summary)
> >>
> >> But, This doesn't.
> >>
> >> by(TestData, LEAID, median(RATIO))
> >>
> >>
> >> ERROR: could not find function "FUN"
> >>
> >> HELP!
> >> Thanks,
> >> LA
> >> --
> >> View this message in context:
> >> http://n4.nabble.com/by-function-tp955789p955789.html
> >> Sent from the R help mailing list archive at Nabble.com.
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> >
> > -- 
> > Ista Zahn
> > Graduate student
> > University of Rochester
> > Department of Clinical and Social Psychology
> > http://yourpsyche.org
> >
> >
> >
>
> -- 
> View this message in context:
> http://n4.nabble.com/by-function-tp955789p962666.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



