[R] Best way to compute the difference between two levels of a factor ?
ehlers at ucalgary.ca
Wed Mar 21 13:27:11 CET 2012
Here's the plyr way I should have thought of earlier:
ddply(data, "ID", numcolwise(diff))
This still requires your data to be ordered by ID and TIME.
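Since both suggestions in this thread depend on row order, here is a minimal base-R sketch of sorting first and then differencing. It rebuilds the example data from the original question (deliberately shuffled here); the name `sorted` is just an illustrative choice, and it assumes every ID has exactly one T1 row and one T2 row.

```r
# Example data from the original question, rows deliberately shuffled
data <- data.frame(
  ID   = c("BB", "AA", "CC", "AA", "CC", "BB"),
  TIME = c("T2", "T1", "T2", "T2", "T1", "T1"),
  X    = c(10.056392, 9.959540, 10.799296, 12.949522, 8.706590, 9.039486),
  Y    = c(14.632169, 11.140529, 10.747609, 9.896559, 14.939197, 13.469104)
)

# Sort by ID, then TIME, so that T1 precedes T2 within each subject
# ("sorted" is a hypothetical name used only for this sketch)
sorted <- data[order(data$ID, data$TIME), ]

# diff() on a length-2 vector returns (second - first), i.e. T2 - T1
res <- aggregate(cbind(X, Y) ~ ID, data = sorted, FUN = diff)
print(res)
```

If some subjects might be missing a time point, diff() would return a zero-length result for them, so it is probably worth checking table(data$ID) beforehand.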
On 2012-03-21 04:51, Eik Vettorazzi wrote:
> Hi Sylvain,
> assuming your data frame is ordered by ID and TIME, how about this
> aggregate(cbind(X,Y)~ID,data, function(x)(x[2]-x[1]))
> #or doing this for all but the first 2 columns of data:
> aggregate(data[,-(1:2)],by=list(data$ID), function(x)(x[2]-x[1]))
> On 21.03.2012 09:48, wphantomfr wrote:
>> Dear R-help Members,
>> I am wondering if anyone can think of an optimal way of computing, for
>> several numeric variables, the difference between 2 levels of a factor.
>> To be clear let's generate a simple data frame with 2 numeric variables
>> collected for different subjects (ID) and 2 levels of a TIME factor
>> (time of evaluation)
>>   ID TIME         X         Y
>> 1 AA   T1  9.959540 11.140529
>> 2 AA   T2 12.949522  9.896559
>> 3 BB   T1  9.039486 13.469104
>> 4 BB   T2 10.056392 14.632169
>> 5 CC   T1  8.706590 14.939197
>> 6 CC   T2 10.799296 10.747609
>> I want to compute for each subject and each variable (X, Y, ...) the
>> difference between T2 and T1.
>> Until now I have done it by reshaping my data frame to the wide format (the
>> columns are then ID, X.T1, X.T2, Y.T1, Y.T2) and then computing the
>> difference between successive columns one by one:
>> but this way is probably not optimal if the difference has to be
>> computed for a large number of variables.
>> How would you handle it?
>> Thanks in advance
>> Sylvain Clément