[R] Not sure if this is "aggregate" or some other task.

Gabor Grothendieck ggrothendieck at gmail.com
Mon May 16 04:45:56 CEST 2005


On 5/15/05, David L. Van Brunt, Ph.D. <dlvanbrunt at gmail.com> wrote:
> I have data where I've taken some measurements three times... twice in
> rapid succession so I could check test-retest reliability of a piece of
> equipment, and then a third measurement some time later.
> 
> Now I'd like to do an analysis where I have two scores... the first being
> the mean of the first two taken the same day, and the second being the one
> taken later.
> 
> I have a lot of other variables in the row, and I'd like to do the same
> thing to all of them. Soo....
> 
> Data.Frame:
> 
> Subj Obs MeasureA MeasureB
> 1 1 45 685
> 1 2 50 690
> 1 3 48 693
> 2 1 39 595
> 2 2 41 585
> 2 3 45 343
> 
> should become:
> Subj Obs MeasureA MeasureB
> 1 1 47.5 687.5
> 1 2 50 690
> 2 1 40 590
> 2 2 41 585
> 
> It seems like a job for "aggregate", but I want to collapse on only cases
> where observation # < 3, and take the mean of a few vars in the aggregation.
> I can't seem to make it work, and didn't find examples that were on the
> mark. I think I'm suffering from "prospective interference" and my SPSS
> syntax knowledge to do exactly this is just getting in my way.
> 
> Any volunteers? I'd be very grateful. Thanks!

This looks like a case where you want to process the sub-data.frame
corresponding to each subject, and 'by' does exactly that. If 'z' is your
data.frame, then 'z.by' below is an object of class "by" holding the
per-subject results, and the last line converts it to a data.frame:


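f <- function(x) {
    # x holds the rows for one subject; average the first two observations
    y <- colMeans(x[1:2, ])
    # the averaged Obs would be 1.5, so label the combined row as observation 1
    y[2] <- 1
    # stack the averaged row on top of the subject's second row
    rbind(y, x[2, ])
}
# apply f to each subject's sub-data.frame
z.by <- by(z, list(Subj = z$Subj), f)
# bind the per-subject pieces into a single data.frame
do.call(rbind, z.by)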
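
For anyone who wants to try this end to end, here is a self-contained
sketch of the same 'by' approach using the example data from the question.
Note that the function above reproduces the "should become" table, which
keeps the observation-2 row, whereas the question's wording asks for the
measurement taken later (observation 3). The name collapse_subject and the
later_row argument are my own additions for illustration, not part of the
original reply, and the sketch assumes every subject has its three rows in
Obs order:

```r
# Example data copied from the question; 'z' is the name used in the reply
z <- data.frame(
  Subj     = c(1, 1, 1, 2, 2, 2),
  Obs      = c(1, 2, 3, 1, 2, 3),
  MeasureA = c(45, 50, 48, 39, 41, 45),
  MeasureB = c(685, 690, 693, 595, 585, 343)
)

# Hypothetical helper: collapse one subject's rows into two rows.
# later_row = 2 mimics the reply; later_row = 3 keeps the later measurement.
collapse_subject <- function(x, later_row = 2) {
  y <- colMeans(x[1:2, ])           # means of the first two observations
  y["Obs"] <- 1                     # label the averaged row as observation 1
  later <- x[later_row, ]           # the row kept as the second score
  later$Obs <- 2                    # label it as observation 2
  rbind(as.data.frame(as.list(y)), later)
}

# Same pattern as the reply: split by subject, then bind the pieces
z.by <- by(z, list(Subj = z$Subj), collapse_subject)
result <- do.call(rbind, z.by)
rownames(result) <- NULL

# Variant keeping the third measurement; 'by' passes later_row on to the helper
z.by3 <- by(z, list(Subj = z$Subj), collapse_subject, later_row = 3)
result3 <- do.call(rbind, z.by3)
rownames(result3) <- NULL
```

With the default, 'result' should match the four rows of the "should
become" table in the question; 'result3' holds the observation-3 values as
the second row for each subject instead.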



