[R] summing a large, partitioned data frame

james.foadi at diamond.ac.uk
Mon Jan 25 17:07:16 CET 2010


Dear R community,
I'm trying to develop a fast way of summing specific rows of a large data frame.
Here is an example of the kind of data frames I'm dealing with:

> refls
      H K L M/ISYM BATCH          I     SIGI
43247 1 0 5     21    79   61.44117  2.20553
1040  1 0 5    257     6   15.16316  0.54431
2324  1 0 5    257     5   46.76152  1.67858
31515 1 0 5    259    60   57.97305  2.08104
35158 1 0 5    259    61    3.15614  0.11329
51575 1 0 6    259    88  380.04477  8.08878
51846 1 0 6    259    89  624.90802 13.30038
28946 1 1 4      1    42 2517.79492 55.37144
23199 1 1 4      5    31 2525.67407 55.54472
23198 1 1 4     21    39 2519.44653 55.40777
............................................
............................................

I need to add up all the I values that share the same H, K, L and M/ISYM.
The data frame resulting from this partial summing should, in this case, look like:

      H K L M/ISYM BATCH          I     SIGI
43247 1 0 5     21    79   61.44117  2.20553
1040  1 0 5    257     6   61.92468  0.54431
31515 1 0 5    259    60   61.12919  2.08104
51575 1 0 6    259    88 1004.95279  8.08878
28946 1 1 4      1    42 2517.79492 55.37144
23199 1 1 4      5    31 2525.67407 55.54472
23198 1 1 4     21    39 2519.44653 55.40777
............................................
............................................


Essentially, I add only those I values that share the same H, K, L and M/ISYM, and place
each sum in a single row of the new data frame. In other words, the rows are first
partitioned and then summed within each partition.

I have tried a for loop, but it takes far too long.

I was wondering whether anyone knows of a better and faster way of doing this operation.
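
For illustration, the partition-then-sum step described above might be sketched in
base R roughly as follows. This is only a sketch under some assumptions: the data
frame is rebuilt here from the ten rows shown above with check.names = FALSE so that
the column keeps the name "M/ISYM"; the helper names keys, grp, sums, first and result
are made up for the example; and BATCH and SIGI are simply taken from the first row of
each group, as in the desired output above. rowsum() and duplicated() are base R
functions.

```r
# Example rows copied from the post; check.names = FALSE keeps the "M/ISYM" name.
refls <- data.frame(
  H        = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  K        = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1),
  L        = c(5, 5, 5, 5, 5, 6, 6, 4, 4, 4),
  `M/ISYM` = c(21, 257, 257, 259, 259, 259, 259, 1, 5, 21),
  BATCH    = c(79, 6, 5, 60, 61, 88, 89, 42, 31, 39),
  I        = c(61.44117, 15.16316, 46.76152, 57.97305, 3.15614,
               380.04477, 624.90802, 2517.79492, 2525.67407, 2519.44653),
  SIGI     = c(2.20553, 0.54431, 1.67858, 2.08104, 0.11329,
               8.08878, 13.30038, 55.37144, 55.54472, 55.40777),
  row.names = c("43247", "1040", "2324", "31515", "35158",
                "51575", "51846", "28946", "23199", "23198"),
  check.names = FALSE
)

# Columns that define a partition (hypothetical helper name).
keys <- c("H", "K", "L", "M/ISYM")

# One character label per row, built by pasting the four key columns together.
grp <- do.call(paste, c(unname(refls[keys]), sep = "_"))

# Grouped sum of I; the result is a one-column matrix with group labels as row names.
sums <- rowsum(refls$I, grp, reorder = FALSE)

# Keep the first row of each group and overwrite its I with the group total.
first <- !duplicated(grp)
result <- refls[first, ]
result$I <- unname(sums[grp[first], 1])

print(result)
```

Because the grouping and summing are done by vectorised functions rather than a row-by-row
loop, this kind of approach would be expected to scale better to large data frames, although
that has not been timed here.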


J



Dr James Foadi PhD
Membrane Protein Laboratory (MPL)
Diamond Light Source Ltd
Diamond House
Harewell Science and Innovation Campus
Chilton, Didcot
Oxfordshire OX11 0DE

Email    :  james.foadi at diamond.ac.uk
Alt Email:  j.foadi at imperial.ac.uk
