[R] conditionally merging adjacent rows in a data frame

Gabor Grothendieck ggrothendieck at gmail.com
Wed Dec 9 14:12:38 CET 2009


On Wed, Dec 9, 2009 at 7:59 AM, Titus von der Malsburg
<malsburg at gmail.com> wrote:
> On Wed, Dec 9, 2009 at 12:11 AM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>> Here are a couple of solutions.  The first uses by and the second sqldf:
>
> Brilliant!  Now I have a whole collection of solutions.  I did a simple
> performance comparison with a data frame that has 7929 lines.
>
> The results were as follows (time spent loading the required packages is
> not included in the measurements):
>
>  times <- c(0.248, 0.551, 41.080, 0.16, 0.190)
>  names(times) <- c("aggregate","summaryBy","by+transform","sqldf","tapply")
>  barplot(times, log="y", ylab="log(s)")
>
> So sqldf clearly wins, followed by tapply and aggregate.  summaryBy is slower
> than necessary because it computes both mean /and/ sum for both x and dur.
> by+transform presumably suffers from the construction of many intermediate
> data frames.
>
> Are there any canonical places where R recipes are collected?  If so, I would
> write up a summary.

If you google for
   R wiki
it's the first hit.
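
For anyone who wants to rerun a comparison like the one quoted above, here is
a minimal sketch using system.time.  The data frame d, the grouping column g
and the number of groups are made up for illustration (the data from the
thread is not shown here), so the timings will not match the figures above.
Only two of the five approaches are timed, since the other three need extra
packages or code that is not reproduced in this message.

```r
# Hypothetical stand-in data: 7929 rows as in the quoted test, but the
# grouping column g and the values of dur are invented for this sketch.
set.seed(1)
n <- 7929
d <- data.frame(g = sample(1:500, n, replace = TRUE),
                dur = runif(n))

# Sum dur within each group, once with aggregate and once with tapply,
# recording the elapsed wall-clock time in seconds for each call.
t_aggregate <- system.time(
  r1 <- aggregate(dur ~ g, data = d, FUN = sum)
)["elapsed"]
t_tapply <- system.time(
  r2 <- tapply(d$dur, d$g, sum)
)["elapsed"]

# Collect the timings in a named vector, in the same shape as the
# times vector in the quoted message.
times <- c(aggregate = unname(t_aggregate), tapply = unname(t_tapply))
print(times)
```

A single run on data this small is noisy, so repeating each call several
times and comparing medians would give a fairer ranking.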



