[R] Aggregating data

Dennis Murphy djmuser at gmail.com
Fri Aug 5 20:59:37 CEST 2011


Hi:

This is the type of problem at which the plyr package excels. Write a
utility function that produces the plot you want using a data frame as
its input argument, and then do something like

library('plyr')
d_ply(results, .(a, b, c), plotfun)

where plotfun is a placeholder for the name of your plot function.
The d in d_ply means that a data frame is taken as input, and the _
means that nothing is returned. This form is used in particular when a
side effect, such as a plot, is the desired 'output'. See
http://www.jstatsoft.org/v40/i01, which contains an example (baseball)
where groupwise plots are produced. (Don't actually run the example
unless you're willing to wait for 1100+ ggplots to be rendered :)
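
To make the idea concrete, here is a minimal sketch of what such a
plot function might look like. Everything except d_ply, .(a, b, c) and
the names results and plotfun is an assumption made for illustration:
the columns x and y, the simulated data, the output directory and the
file naming scheme are all hypothetical, since the original data were
not posted.

```r
library(plyr)

# Hypothetical stand-in for 'results': grouping columns a, b, c plus
# two made-up measurement columns x and y (2 x 2 x 2 = 8 groups).
set.seed(1)
results <- expand.grid(a = 1:2, b = 1:2, c = 1:2, rep = 1:10)
results$x <- runif(nrow(results))
results$y <- rnorm(nrow(results))

# Assumed output location; one PDF file is written per group.
outdir <- file.path(tempdir(), "group_plots")
dir.create(outdir, showWarnings = FALSE)

# Plot a single subset. Each piece passed in by d_ply still contains
# the grouping columns, so they can be used to label the plot.
plotfun <- function(df) {
  label <- sprintf("a%s_b%s_c%s", df$a[1], df$b[1], df$c[1])
  pdf(file.path(outdir, paste0("plot_", label, ".pdf")))
  on.exit(dev.off())  # close the device even if plot() fails
  plot(df$x, df$y, main = label, xlab = "x", ylab = "y")
}

# Called for its side effect only: one file per (a, b, c) combination.
d_ply(results, .(a, b, c), plotfun)
```

Writing each plot to its own file avoids having later plots overwrite
earlier ones on an interactive device.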

If memory serves, you should also be able to produce graphics for each
data subset using the data.table package.
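
A rough sketch of the data.table route, assuming a reasonably recent
version of the package (for CJ and the .() alias). As before, the
columns x and y, the simulated data and the file names are
hypothetical and only serve to make the example self-contained.

```r
library(data.table)

# Hypothetical stand-in for 'results' with 8 groups of 10 rows each.
set.seed(1)
results <- CJ(a = 1:2, b = 1:2, c = 1:2, rep = 1:10)
results[, x := runif(.N)]
results[, y := rnorm(.N)]

# Assumed output location; one PDF file is written per group.
outdir <- file.path(tempdir(), "dt_plots")
dir.create(outdir, showWarnings = FALSE)

# Within each group, a, b and c are length-one values, so they can be
# used directly to build the file name and the plot title.
made <- results[, {
  fname <- file.path(outdir, sprintf("plot_a%s_b%s_c%s.pdf", a, b, c))
  pdf(fname)
  plot(x, y, main = sprintf("a=%s, b=%s, c=%s", a, b, c))
  dev.off()
  .(file = fname)  # return the file name so the result lists what was written
}, by = .(a, b, c)]
```

The result, made, has one row per group with the path of the file that
was written for it.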

If you want a more concrete solution, provide a more concrete example.

HTH,
Dennis

On Fri, Aug 5, 2011 at 9:55 AM, Jeffrey Joh <johjeffrey at hotmail.com> wrote:
>
> I aggregated my data:
>
> aggresults <- aggregate(results, by=list(results$a, results$b, results$c), FUN=mean, na.rm=TRUE)
>
> results has about 8000 lines of data, and aggresults has about 80 lines.  I would like to create a separate variable for each of the 80 aggregates, each containing the 100 lines that were aggregated.  I would also like to create plots for each of those 80 datasets.
>
> Is there a way of automating this, so that I don't have to do each of the 80 aggregates individually?
>
> Jeff