[R] Replicating output from a function

AC Del Re delre at wisc.edu
Wed Feb 17 21:58:21 CET 2010


Hi All,

I have a function for data frames that have multiple rows per id; it
aggregates the data down to one row per id. It also randomly selects
one of the within-id values of a variable (mod), which often differs
within id.  Assume the data frame below is much larger and that I want
to run this function, say, 100 times and then take the mean of r over
those 100 replications. Is there an easy way to do this?  What about
more complex situations where the output includes r, var(r), wi, etc.,
and a mean of every output column is desired?  For example:

id<-c(1,1,1,rep(4:12))
n<-c(10,20,13,22,28,12,12,36,19,12, 15,8)
r<-c(.68,.56,.23,.64,.49,-.04,.49,.33,.58,.18, .6,.21)
mod1<-factor(c(1,2,2, rep(c(1,2,3),3)))
mod2<-c(1,2,15,rep(3,9))
datas<-data.frame(id,n,r,mod1,mod2)

# intermediate-level function (courtesy of Hadley Wickham):

pick_one <- function(x) {
 if (length(x) == 1) return(x)
 sample(x, 1)
}


# Function that I want replicated 100 times:

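library(plyr)  # provides ddply() and summarize()

cat_sum1 <- function(meta, mod) {
 m <- meta
 m$mod <- mod
 # collapse to one row per id; mod is drawn at random within each id
 meta <- ddply(m, .(id), summarize, r = mean(r), n = mean(n),
               mod = pick_one(mod))
 meta$z  <- 0.5*log((1 + meta$r)/(1-meta$r))  # Fisher's z transform of r
 meta$var.z <- 1/(meta$n-3)                   # variance of z
 meta$wi <-  1/meta$var.z                     # inverse-variance weight
 return(meta)
}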

# output from 1 run:

 cat_sum1(datas,datas$mod1)
   id     r        n mod           z      var.z       wi
1   1  0.49 14.33333   2  0.53606034 0.08823529 11.33333
2   4  0.64 22.00000   1  0.75817374 0.05263158 19.00000
3   5  0.49 28.00000   2  0.53606034 0.04000000 25.00000
4   6 -0.04 12.00000   3 -0.04002135 0.11111111  9.00000
5   7  0.49 12.00000   1  0.53606034 0.11111111  9.00000
6   8  0.33 36.00000   2  0.34282825 0.03030303 33.00000
7   9  0.58 19.00000   3  0.66246271 0.06250000 16.00000
8  10  0.18 12.00000   1  0.18198269 0.11111111  9.00000
9  11  0.60 15.00000   2  0.69314718 0.08333333 12.00000
10 12  0.21  8.00000   3  0.21317135 0.20000000  5.00000

Is there a way that I could get this to run multiple times
(internally) and then output in a similar format as above but with the
mean values from the multiple runs?
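
One possible approach, offered as a rough sketch rather than a definitive answer: collect the runs in a list with replicate(..., simplify = FALSE), then sum the numeric columns element-wise with Reduce() and divide by the number of runs. The names cat_sum1_base and rep_mean are hypothetical. cat_sum1_base is a base-R stand-in for cat_sum1 so that the sketch runs without plyr, and it is assumed that every run returns the ids in the same order.

```r
# Example data from the post.
datas <- data.frame(
  id   = c(1, 1, 1, 4:12),
  n    = c(10, 20, 13, 22, 28, 12, 12, 36, 19, 12, 15, 8),
  r    = c(.68, .56, .23, .64, .49, -.04, .49, .33, .58, .18, .6, .21),
  mod1 = factor(c(1, 2, 2, rep(c(1, 2, 3), 3)))
)

# Pick one element by position, so a length-1 numeric is never
# expanded by sample() into 1:x.
pick_one <- function(x) x[sample.int(length(x), 1)]

# Hypothetical base-R stand-in for cat_sum1: one row per id.
cat_sum1_base <- function(meta, mod) {
  meta$mod <- mod
  pieces <- lapply(split(meta, meta$id), function(d) {
    data.frame(id = d$id[1], r = mean(d$r), n = mean(d$n),
               mod = as.character(pick_one(d$mod)))
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out$z     <- atanh(out$r)       # same as 0.5*log((1+r)/(1-r))
  out$var.z <- 1 / (out$n - 3)
  out$wi    <- 1 / out$var.z
  out
}

# Hypothetical helper: call f() `times` times and average the numeric
# columns element-wise. Non-numeric columns (here mod) are dropped.
rep_mean <- function(f, times = 100) {
  runs <- replicate(times, f(), simplify = FALSE)
  num  <- names(runs[[1]])[sapply(runs[[1]], is.numeric)]
  sums <- Reduce(`+`, lapply(runs, function(d) as.matrix(d[num])))
  as.data.frame(sums / times)
}

set.seed(1)
avg <- rep_mean(function() cat_sum1_base(datas, datas$mod1), times = 100)
```

With the function as posted, r and n are per-id means, so the numeric columns come out the same on every run and only mod changes. The averaging step only makes a difference once a numeric output depends on the randomly selected rows.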

Any help is much appreciated!

AC


