[R] sample size mixed poisson glm

Thu Mar 25 18:42:49 CET 2010

Dear list members,

I am trying to estimate  power/sample size based on already collected
pilot data.
The setting: Five orchards have been sampled with respect to fruit
damage. In each orchard two plots were selected (semi-randomly), where
one was treated and one left as control. Within each plot around 50
trees were randomly selected, and the number of damaged fruits counted
for each tree. The overall question to be answered is whether the
treatment will have an effect (hopefully negative) on number of
damaged fruits counted. To take care of the expected within-orchard
correlation in counts, I proceed with a mixed poisson glm, and find a
significant effect as expected. I then proceed in estimating power and
sample size needed to obtain significant results at the 95% level. The
aim is to find the number of orchards and the number of trees in each
of the two plots per orchard, needed to obtain reliable results for
the effect-size observed in the pilot data.
I found that simulating from the fitted model would be most likely the
best way to approach this problem.

#The model
m.mix<-lmer(counts~treat+(1|orchard),family="poisson",data=dat)

#and simulations using

simu.pois.mixed<-function(object,ngroups,nsubj,nsim,fact=1)
{
  S<-c()
  for(i in 1:nsim){
    rands<-mvrnorm(ngroups,0,Sigma=fact*VarCorr(object)[[1]])
    nwd<-unique(object at X)
    # repeat this ngoups X nsubj/group times
    nwd2<-do.call("rbind",replicate(nsubj,nwd,FALSE))
    nwd3<-do.call("rbind",replicate(ngroups,nwd2,FALSE))
    mu<-nwd3%*%fixef(object)+ rep(rands,each=nsubj)
    newresp<-rpois(length(mu),poisson()$linkinv(mu))
    nwdata<-data.frame(newresp,nwd3[,-1],factor(rep(1:length(rands),each=nsubj)))
    names(nwdata)<-names(object at frame)
    S[i]<-ifelse(summary(update(object,data=nwdata))@coefs["treat","Pr(>|z|)"]<=.05,1,0)
    }
    cat("\n\tPower is:",mean(S),"based
on",nsim,"simulations",ngroups,"groups and",nsubj,"subjects per
group\n")
}

simu.pois.mixed(m.mixed,ngroups=5,nsubj=55,nsim=100,fact=1)

which results to
"Power is: 0.99 based on 100 simulations 5 groups and 55 subjects per group"

based on which I conclude that I already have a power of about 1, even
with only 5 orchards!!!. I find it hard to believe it though, so I
wonder whether this whole approach is flawed in some way, and if so,
what are the alternatives. Any other comments or recommendations are
most welcome btw.

Thanks in advance

Caspar