[R] Calculating group means using self-written function

Lauri Nikkinen lauri.nikkinen at iki.fi
Tue Oct 2 14:35:55 CEST 2007


Well, I finally found a roundabout way to do it:

fun <- function(x, y) sum(x)/max(y)
aggregate(vsid$lev, list(vsid$month, vsid$year), fun, y=by(vsid$date,
vsid$month, function(x) length(unique(x))))
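
Note that max(y) gives every group the same denominator, namely the largest per-month day count. A minimal sketch of a per-group alternative follows, assuming a small made-up data frame in place of the real vsid (the data values and the helper name fun_idx are hypothetical). It first shows that an extra argument to aggregate is handed to FUN whole, then aggregates over row indices so that FUN can subset both columns for the current group.

```r
# Hypothetical stand-in for the vsid data frame (not the original data).
vsid <- data.frame(
  date = as.Date(c("2007-06-01", "2007-06-01", "2007-06-02",
                   "2007-07-01", "2007-07-01", "2007-07-01")),
  lev  = c(1, 2, 3, 4, 5, 6)
)
vsid$month <- as.integer(format(vsid$date, "%m"))
vsid$year  <- as.integer(format(vsid$date, "%Y"))
groups <- list(month = vsid$month, year = vsid$year)

# Extra arguments are passed on unsplit: every group is divided by the
# number of unique dates in the whole data set (3 here).
fun <- function(x, y) sum(x) / length(unique(y))
res_whole <- aggregate(vsid$lev, groups, fun, y = vsid$date)
res_whole$x   # 2 5

# Aggregate over row indices instead; FUN then picks out the lev and
# date values that belong to the current group.
fun_idx <- function(i, lev, date) sum(lev[i]) / length(unique(date[i]))
res <- aggregate(seq_len(nrow(vsid)), groups, fun_idx,
                 lev = vsid$lev, date = vsid$date)
res$x         # 3 15
```

With this toy data, June has a sum of 6 over 2 distinct days and July a sum of 15 over 1 distinct day, so the index-based version divides each month by its own day count.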

Thanks,
Lauri

2007/10/2, Lauri Nikkinen <lauri.nikkinen at iki.fi>:
> Thanks, Petr, for your kind answer. I get it now, but it seems that the
> argument y is not split by "list(vsid$month, vsid$year)" in the
> aggregate function. With "length(unique(y))" I should get the number of
> days in each month in the denominator, but instead I get the number of
> days across all months. So I do not get the correct answers. Should I
> modify my fun in some way?
>
> Best regards,
> Lauri
>
> 2007/10/2, Petr PIKAL <petr.pikal at precheza.cz>:
> > Hi
> >
> > lauri.nikkinen at gmail.com wrote on 02.10.2007 13:19:09:
> >
> > > Thanks Petr,
> > >
> > > Yes, your code seems to work. But when I try to reproduce it with my
> > > original data set
> > >
> > > fun <- function(x, y) sum(x)/length(unique(y))
> > > aggregate(vsid$lev, list(vsid$month, vsid$year), fun,
> > >           vsid$lev=vsid$date)
> >
> > It should be
> >
> > aggregate(vsid$lev, list(vsid$month, vsid$year), fun, y=vsid$date)
> >
> > From the help page:
> >
> > ## S3 method for class 'data.frame':
> > aggregate(x, by, FUN, ...)
> >
> > ...
> > further arguments passed to or used by methods.
> >
> > Your function has two arguments: x, which is assigned vsid$lev, and y,
> > to which you want to assign vsid$date. You can imagine that aggregate
> > splits your "x" according to the levels given in "by" and applies the
> > function "fun" to each split, together with any other arguments, in
> > your case "y". So you need to pass that argument under the name your
> > function uses; otherwise it does not know what to do with it.
> >
> > Regards
> > Petr
> >
> > >
> > > I get
> > >
> > > Error: syntax error, unexpected EQ_ASSIGN, expecting ',' in
> > > "aggregate(vsid$lev, list(vsid$month, vsid$year), fun, vsid$lev="
> > >
> > > Can you interpret what is wrong? vsid$date is
> > >
> > > $ date       :Class 'Date'  num [1:637] 13695 13695 13695 13695 13695 ...
> > >
> > > Cheers,
> > > Lauri
> > >
> > > 2007/10/2, Petr PIKAL <petr.pikal at precheza.cz>:
> > > > Hi
> > > >
> > > > r-help-bounces at r-project.org wrote on 02.10.2007 10:44:20:
> > > >
> > > > > Hi R-users,
> > > > >
> > > > > Suppose I have the following data set.
> > > > >
> > > > > y1 <- rnorm(20) + 6.8
> > > > > y2 <- rnorm(20) + (1:20*1.7 + 1)
> > > > > y3 <- rnorm(20) + (1:20*6.7 + 3.7)
> > > > > y <- c(y1,y2,y3)
> > > > > var1 <- rep(1:5,12)
> > > > > z <- rep(1:6,10)
> > > > > f <- gl(3,20, labels=paste("lev", 1:3, sep=""))
> > > > > d <- data.frame(var1=var1, z=z,y=y, f=f)
> > > > >
> > > > > Using the following code I can calculate group means:
> > > > >
> > > > > library(doBy)
> > > > > summaryBy(y ~ f + var1, data=d, FUN=mean)
> > > > >
> > > > > How do I have to modify the FUN argument if I want to calculate the
> > > > > mean using unique values?
> > > > >
> > > > > For instance:
> > > > >
> > > > > fun <- function(x, y) sum(x)/length(unique(y))
> > > > > summaryBy(y ~ f + var1, data=d, FUN=fun(y, z))
> > > > >
> > > > > Error in get(x, envir, mode, inherits) : variable "currFUN" of mode
> > > > > "function" was not found
> > > >
> > > > I am not sure how to do it in doBy, but using aggregate,
> > > >
> > > > aggregate(d$y, list(d$var1,d$f), fun, y=z)
> > > >
> > > > will probably do what you want.
> > > >
> > > > Regards
> > > > Petr
> > > >
> > > > >
> > > > > Best regards
> > > > > LN
> > > > >
> > > > > ______________________________________________
> > > > > R-help at r-project.org mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > > and provide commented, minimal, self-contained, reproducible code.
> > > >
> > > >
> >
> >
>


