[R] Fast Normalize by Group

Peter Langfelder peter.langfelder at gmail.com
Thu Nov 29 20:02:48 CET 2012


Not tested but should work:

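# Sum of x within each group; the result is named by group ID
sums = tapply(x, group, sum)
# Expand the per-group sums back to one entry per row of x
sums.ext = sums[match(group, names(sums))]
# Divide each value by its group's total
normalized = x/sums.ext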

It may be that the tapply is just as slow as your loop though, I'm not sure.
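
A possible alternative, also only a sketch: base R's ave() applies a function within each group and returns a result already aligned with x, so the match() step is not needed. The small x and group vectors below are made-up example data, not from the original question, and I have not timed this against the loop on data of the size described.

```r
# Hypothetical example data: 6 values in 3 groups (not from the original post)
x <- c(1, 3, 2, 2, 5, 5)
group <- c(10L, 10L, 20L, 20L, 30L, 30L)

# ave() returns, for each element, FUN applied to that element's group,
# in the same order and length as x
normalized <- x / ave(x, group, FUN = sum)

# Sanity check: the normalized values within each group should sum to 1
group.totals <- tapply(normalized, group, sum)
stopifnot(isTRUE(all.equal(as.vector(group.totals), rep(1, 3))))
```

Whether this is faster than tapply() plus match() will depend on the data, so it is worth checking with system.time() on the real data set.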

HTH,

Peter


On Thu, Nov 29, 2012 at 10:55 AM, Noah Silverman <noahsilverman at ucla.edu> wrote:
> Hi,
>
> I have a very large data set (approx. 100,000 rows).
>
> The data comes from around 10,000 "groups" with about 10 entries per group.
>
> The values are in one column, the group ID is an integer in the second column.
>
> I want to normalize the values by group:
>
> normalized = numeric(length(x))
> for(g in unique(group)){
>         idx = group == g
>         normalized[idx] = x[idx] / sum(x[idx])
> }
>
> This works fine in a loop, but is slow. Is there a faster way to do this?
>
> Thanks!



