[Rd] Improved Data Aggregation and Summary Statistics in R

Joris Meys
Wed Feb 27 11:17:25 CET 2019


Dear Sebastian,

Initially I was a bit hesitant to think about yet another way to summarize
data, but your illustrations convinced me this is actually a great addition
to the toolset currently available in different R packages. Many of us have
written custom functions to get the required tables for specific data sets,
but this would reduce that effort to simply using the right collap() call.
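
To illustrate the kind of custom function meant here, a minimal base R
sketch follows. It is not the proposed collap() interface, which is not shown
in this thread. The toy data and the helper names stat_mode() and
summarise_mixed() are made up for illustration: numeric columns are
aggregated with the mean and all other columns with the most frequent value.

```r
# Toy data with one numeric and one categorical column (values are made up).
df <- data.frame(
  group  = c("a", "a", "b", "b", "b"),
  income = c(10, 20, 30, 40, 50),
  region = c("north", "north", "south", "east", "south"),
  stringsAsFactors = FALSE
)

# Most frequent value; ties resolve to the first value in sorted order.
stat_mode <- function(x) names(which.max(table(x)))

# Hypothetical helper: mean for numeric columns, mode for everything else.
summarise_mixed <- function(data, by) {
  value_cols <- setdiff(names(data), by)
  is_num <- vapply(data[value_cols], is.numeric, logical(1))
  parts <- list()
  if (any(is_num)) {
    parts$num <- aggregate(data[value_cols[is_num]], by = data[by], FUN = mean)
  }
  if (any(!is_num)) {
    parts$cat <- aggregate(data[value_cols[!is_num]], by = data[by],
                           FUN = stat_mode)
  }
  # Join the per-type results back together on the grouping column(s).
  Reduce(function(x, y) merge(x, y, by = by), parts)
}

res <- summarise_mixed(df, by = "group")
print(res)
```

Splitting by column type and merging afterwards is exactly the sort of
boilerplate that tends to get rewritten for every new data set.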

Like Inaki, I'm very interested in trying it out if you have the code
available somewhere.

Cheers
Joris





On Wed, Feb 27, 2019 at 9:01 AM Sebastian Martin Krantz <
sebastian.krantz using graduateinstitute.ch> wrote:

> Dear Developers,
>
> Having spent time developing and thinking about how data aggregation and
> summary statistics can be enhanced in R, I would like to present my
> ideas/efforts in the form of two commands:
>
> The first, which I have called 'collap' for now, is an upgrade of aggregate.
> It accommodates aggregate's existing functionality and extends it in several
> respects: most importantly, it works with multilevel and multi-type data,
> supports multiple function calls and highly customized aggregation tasks,
> offers much greater flexibility in how inputs are passed, and returns tidy
> output.
>
> The second function, 'qsu', is an advanced and flexible summary command for
> cross-sectional and multilevel (panel) data (i.e. it can provide overall,
> between-entity and within-entity statistics, and allows for grouping, custom
> functions and transformations). It also provides a quick method to compute
> and output within-transformed data.
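
For readers unfamiliar with the overall/between/within decomposition, here is
a minimal base R sketch on a made-up two-entity panel. It is not the qsu
implementation. Adding the overall mean back to the within-transformed values
is one common convention and is an assumption here; the thread does not say
which convention qsu follows.

```r
# Toy panel: 2 entities observed over 3 periods (values are made up).
panel <- data.frame(
  id = rep(c("A", "B"), each = 3),
  x  = c(1, 2, 3, 10, 20, 30)
)

overall_mean <- mean(panel$x)

# Entity mean repeated on every row of that entity (ave() defaults to mean).
entity_mean <- ave(panel$x, panel$id)

# Within transformation: deviation from the entity mean, re-centred on the
# overall mean (assumed convention, see the note above).
x_within <- panel$x - entity_mean + overall_mean

# Between values: one mean per entity.
x_between <- tapply(panel$x, panel$id, mean)

# Standard deviations of the three views of the same variable.
sd_overall <- sd(panel$x)
sd_between <- sd(x_between)
sd_within  <- sd(x_within)

print(c(overall = sd_overall, between = sd_between, within = sd_within))
```

The between statistics describe variation across entity means, while the
within statistics describe variation around each entity's own mean.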
>
> Both commands are efficiently built from core R, but provide for optional
> integration with data.table, which renders them extremely fast on large
> datasets. An explanation of the syntax, a demonstration and benchmark
> results are provided in the attached vignette.
>
> Since both commands accommodate existing functionality while adding
> significant basic functionality, I thought that their addition to the stats
> package would be worth considering. I would be happy to receive your feedback.
>
> Best regards,
>
> Sebastian Krantz


-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

-----------
Biowiskundedagen 2018-2019
http://www.biowiskundedagen.ugent.be/

