[R] Averaging over data sets

MacQueen, Don macqueen1 at llnl.gov
Fri Jan 13 22:58:38 CET 2012


Here is a solution that works for your small example.
It might be difficult to prepare your larger data sets to use the same
method.

db <- rbind(d1, d2)
aggregate(subset(db, select = -c(subject, trt)),
          by = list(subject = db$subject), FUN = mean)
## or, for example,
aggregate(subset(db, select = -c(subject, trt)),
          by = list(subject = db$subject, trt = db$trt), FUN = mean)

For aggregate() to compute means, its first argument must contain only
numeric columns. That is what subset(db, select = -c(subject, trt)) does
for you: it drops the non-numeric columns subject and trt.
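
As a self-contained check, the sketch below recreates d1 and d2 from the
original question and runs the two-variable grouping. The names num and
res are only illustrative, and it assumes the small example is
representative of the real data.

```r
# Example data copied from the original question
d1 <- data.frame(subject = c("Felipe", "John"), eat1 = 1:2, eat3 = 5:6,
                 trt = c("t1", "t2"))
d2 <- data.frame(subject = c("Felipe", "John"), eat1 = 3:4, eat3 = 6:7,
                 trt = c("t1", "t2"))

# Stack the data sets, then average the numeric columns within each
# subject/trt combination
db  <- rbind(d1, d2)
num <- subset(db, select = -c(subject, trt))   # numeric columns only
res <- aggregate(num, by = list(subject = db$subject, trt = db$trt),
                 FUN = mean)
print(res)
```

Grouping by both subject and trt keeps trt in the result, at the cost of
placing the grouping columns first rather than in their original
positions.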

(d1 + d2)/2 did not work because d1 and d2 are data frames that include
non-numeric columns (subject and trt), and arithmetic is not meaningful
for those. With much more effort, you could have computed your averages
one at a time,
  (d1$eat1[d1$subject=='Felipe'] + d2$eat1[d2$subject=='Felipe'])/2
and similarly for eat3 and John. But that is of course not practical for
larger data sets.
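
For many imputed data sets, one possible alternative is to restrict the
arithmetic to the numeric columns. This is only a sketch: it assumes every
data set has the same rows in the same order and the same columns, and
the names imps, num_cols and avg are hypothetical. (With Amelia, I
believe the imputed data sets are available as a list in the imputations
component of the output object, but please check the package
documentation.)

```r
# Example data copied from the original question
d1 <- data.frame(subject = c("Felipe", "John"), eat1 = 1:2, eat3 = 5:6,
                 trt = c("t1", "t2"))
d2 <- data.frame(subject = c("Felipe", "John"), eat1 = 3:4, eat3 = 6:7,
                 trt = c("t1", "t2"))

# Hypothetical list holding all imputed data sets
# (ASSUMPTION: identical row order and columns in every element)
imps <- list(d1, d2)

# Identify the numeric columns from the first data set
num_cols <- vapply(imps[[1]], is.numeric, logical(1))

# Start from the first data set so the non-numeric columns are kept,
# then overwrite the numeric columns with the cell-by-cell mean
avg <- imps[[1]]
avg[num_cols] <- Reduce(`+`, lapply(imps, function(d) d[num_cols])) /
  length(imps)
print(avg)
```

Unlike the rbind() and aggregate() approach, this keeps the original
column order and does not need a unique subject identifier, but it gives
wrong answers if the rows are not aligned across data sets.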

-Don



-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/12/12 10:16 PM, "Felipe Nunes" <felipnunes at gmail.com> wrote:

>Hi all,
>
>after using Amelia II to create 10 imputed data sets, I need to average
>them into a single data set that contains the cell-by-cell average of the
>imputed variables, in addition to the values of the variables that were
>not imputed. The data have many variables (some numeric, others factors)
>and more than 20000 observations. I do not know how to average them.
>Any help?
>
>Below I provide a small example:
>
>Suppose Amelia provided two datasets:
>
>d1 <- data.frame(subject = c("Felipe", "John"), eat1 = 1:2, eat3 = 5:6,
>                 trt = c("t1", "t2"))
>
>d2 <- data.frame(subject = c("Felipe", "John"), eat1 = 3:4, eat3 = 6:7,
>                 trt = c("t1", "t2"))
>
>I tried
>
>(d1 + d2)/2
>
>but I lose my factors. mean() did not work either.
>
>The result I'd like is:
>
>  subject eat1 eat3 trt
>1  Felipe    2  5.5  t1
>2    John    3  6.5  t2
>
>thanks,
>
>*Felipe Nunes*
>CAPES/Fulbright Fellow
>PhD Student Political Science - UCLA
>Web: felipenunes.bol.ucla.edu
>


