[R] Memory limit in Aggregate()

peter dalgaard pdalgd at gmail.com
Tue Aug 2 20:09:57 CEST 2011


On Aug 2, 2011, at 19:09, Guillaume wrote:

> Hi Peter,
> 
> Yes I have a large number of factors in the listBy table.
> 
> Do you mean that aggregate() creates a complete cartesian product of the
> "by" columns ? (and creates combinations of values that do not exist in the
> original "by" table, before removing them when returning the aggregated
> table?)

Hm, at least in recent versions that shouldn't happen. The "meat" of aggregate.data.frame is

        ans <- lapply(split(e, grp), FUN, ...)

where grp is a numerical coding of the factor combination for each cell. That could conceivably contain some large values, but since it is numeric (and not a factor with levels, say, 0:(n1*n2*n3*n4-1)), split should not generate more groups than are present in the data.
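
To illustrate (a minimal sketch of the general mechanism, not the actual internals of aggregate.data.frame): split() on a numeric code creates only the groups that actually occur in the data, whereas splitting on a factor that carries all cartesian levels would also create the empty cells:

        f1 <- factor(c("a", "a", "b"))
        f2 <- factor(c("x", "y", "x"))
        ## numeric coding of the occupied cells only
        grp <- as.numeric(interaction(f1, f2, drop = TRUE))
        length(split(seq_along(grp), grp))                  # 3 groups, one per occupied cell
        ## splitting on the full factor keeps the empty "b.y" cell
        length(split(seq_along(grp), interaction(f1, f2)))  # 4 groups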

Some of this code was rewritten in January 2010, so you might want to try a version more recent than yours from May 2009...
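
For reference (nothing specific to aggregate() here), you can check which version you are running with:

        R.version.string  # the version string of the running R session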

> 
> 
> Thanks a lot,
> Guillaume
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg


