[R] question about the aggregate function with respect to order of levels of grouping elements

jim holtman jholtman at gmail.com
Sun Dec 16 16:04:25 CET 2007


What version of R are you using?  Here is the output I got with 2.6.1:

> library(chron)
> dts=seq.dates("1/1/01","12/31/03")
> rnum=rnorm(1:length(dts))
> df=data.frame(date=dts,obs=rnum)
> agg=aggregate(df[,2],list(year=years(df[,1]),month=months(df[,1])),sum)
> levels(agg$month) # aggregate() automatically generates levels sorted by alphabet.
 [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
>
> fmonth=factor(months(df[,1]))
> levels(fmonth) # factor() automatically generates the correct order of levels.
 [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
> agg2=aggregate(df[,2],list(year=years(df[,1]),month=fmonth),sum)
> levels(agg2$month) # even if a factor with levels in the correct order is supplied, aggregate() sorts the levels by alphabet regardless.
 [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
>
>

The levels come back in calendar order in all three cases, so the order seems to be correct here.
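
If your version does return the levels alphabetically, one possible workaround is to re-impose the level order on the result yourself. Below is a minimal sketch, not a definitive fix. It makes two assumptions that are not in the original post: it uses base R Date objects in place of the chron objects so that it runs without extra packages, and it uses the built-in constant month.abb as the reference order. The seed is arbitrary.

```r
# Sketch only: base R Dates stand in for the chron dates of the original post
dts <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
set.seed(1)  # arbitrary seed, only so the example is repeatable
df <- data.frame(date = dts, obs = rnorm(length(dts)))

# Grouping variables: year as character, month as a factor whose levels
# are given explicitly in calendar order (month.abb is built into R)
yr  <- format(df$date, "%Y")
mon <- factor(month.abb[as.integer(format(df$date, "%m"))], levels = month.abb)

agg <- aggregate(df$obs, list(year = yr, month = mon), sum)

# Whatever order aggregate() returned, rebuild the factor with the
# levels in calendar order
agg$month <- factor(as.character(agg$month), levels = month.abb)
levels(agg$month)
```

Rebuilding the factor after the call does not depend on how aggregate() treats the levels it is given, so it should behave the same way regardless of version.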

On Dec 16, 2007 9:23 AM, tom soyer <tom.soyer at gmail.com> wrote:
> Hi,
>
> I am using aggregate() to add up groups of data according to year and month.
> It seems that the function aggregate() automatically sorts the levels of
> factors of the grouping elements, even if the order of the levels of factors
> is supplied. I am wondering if this is a bug, or if I missed something
> important. Below is an example that shows what I mean. Does anyone know if
> this is just the way the aggregate function works, or are there ways
> to force aggregate() to keep the order of levels of factors supplied by the
> grouping elements? Thanks!
>
> library(chron)
> dts=seq.dates("1/1/01","12/31/03")
> rnum=rnorm(1:length(dts))
> df=data.frame(date=dts,obs=rnum)
> agg=aggregate(df[,2],list(year=years(df[,1]),month=months(df[,1])),sum)
> levels(agg$month) # aggregate() automatically generates levels sorted by alphabet.
>
> [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
>
> fmonth=factor(months(df[,1]))
> levels(fmonth) # factor() automatically generates the correct order of levels.
>
> [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
>
>
> agg2=aggregate(df[,2],list(year=years(df[,1]),month=fmonth),sum)
> levels(agg2$month) # even if a factor with levels in the correct order is supplied, aggregate() sorts the levels by alphabet regardless.
>
> [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
>
>
> --
> Tom
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


