[R] MCMCglmm priors including phylogeny

Stott, Iain ims203 at exeter.ac.uk
Wed May 2 03:23:25 CEST 2012


Hi all,

I'm hoping I might be able to get some help with some issues specifying priors for MCMCglmm.

I'm trying to fit a gaussian glmm using MCMCglmm to a data set with two (correlated) response variables.  The response variables are both logit-transformed proportions (there are a few reasons why I've chosen these with gaussian error over binomal glmm, which I won't  go into).  Each probability represents a response for a single plant species (c.150 species), and I want to incorporate the phylogeny (almost fully resolved with only 2 polytomies, including branch lengths) in the error structure.  There is only 1 explanatory variable, a 5-level factor.  I'm using uninformative priors for fixed parameters (i.e. not specifying any priors).

Here's my code:

#prior model1
prior1<-list(R = list(V = diag(2)*x, n=2),
             G = list(G1=list(V = diag(2)*x, n=2)))

#model1
m1<-MCMCglmm(cbind(y1.logit,y2.logit) ~ trait:Ecotype - 1, 
             random=~us(trait):animal, rcov=~us(trait):units, family=c("gaussian","gaussian"),
             prior=prior1, data=meandata, pedigree=meantree, nodes="TIPS",
             thin=100, nitt=150000, burnin=30000, verbose=F)

#prior model2
prior2<-list(R = list(V = diag(2)*x, n=2))

#model2
m2<-MCMCglmm(cbind(reac.prob.logit,inert.prob.logit) ~ trait:Ecotype - 1, 
             rcov=~us(trait):units, family=c("gaussian","gaussian"),
             prior=prior2, data=meandata, 
             thin=100, nitt=150000, burnin=30000, verbose=F)

I've tried prior variance values of 5, 10, 50, 100, 500, 1000, 5000 and 10000 (by varying x in the code above), and I've used a large thinning interval (100) to deal with autocorrelation arising from the correlated responses. Chains seem to mix well as long as I use enough iterations (over 100000 seems to be best).  Posterior distributions from plot(model) are normal-distributed for fixed effects and residuals in every case.  However, posteriors for errors don't behave as nicely:
- Lower prior values (superficially <100) result in positively-skewed distributions for trait:animal whereas higher prior values (superficially >100) are much more normal.  With the logit data running from ~-3.5 to ~10, it seems that the prior values that result in well-fitting models are higher than the variance of the data itself.  
- Fixed parameter values vary (but not considerably) with choice of prior, but error parameters seem to be more sensitive to prior choice.  
- Posterior means for errors and residuals are larger and more variable for y1:y1 and y2:y2 combinations (in the hundreds) but smaller for y1:y2 and y2:y1 combinations (in the tens).

Considering this, my first question is: what are appropriate priors for G and R?

I'm aware that my priors assume a priori independence between the two responses, and I know that they will covary, but my understanding is that specifying us(trait) in my errors and residuals in my residuals should estimate covariance anyway.  But, my second question is this: what are the merits in using a multi-response model?  The phylogenetic models are taking a fair amount of time to run, and I hope that without the need to estimate covariances, they might be a bit speedier.

At some point, I will want to incorporate replication within species in the analysis (populations nested within species), but for now I'm really just interested in looking at mean response per species.  I'm assuming that priors will probably be of similar value (but more of them) once I include within-species replication... or will I have to go through this again??


Any help appreciated, thanks in advance

Iain


--------------------------------------------------------------------------------
Iain Stott
Centre for Ecology and Conservation
University of Exeter, Cornwall Campus
Tremough
Treliever Road
Penryn
Cornwall
TR10 9EZ
Tel: 01326 371852
http://biosciences.exeter.ac.uk/cec/staff/postgradresearch/iainstott/
--------------------------------------------------------------------------------


More information about the R-help mailing list