[R] Mixed effects multinomial regression and meta-analysis

Tue Mar 6 10:10:01 CET 2007

Here is my suggestion. 

Let P_i denote the true proportion in the ith study and p_i the corresponding observed proportion based on a sample of size n_i. Then we know that p_i is an unbiased estimate of P_i and if n_i is sufficiently large, we know that p_i is approximately normally distributed as long as P_i is not too close to 0 or 1. Moreover, we can estimate the sampling variance of p_i with p_i(1-p_i)/n_i. Alternatively, we can use the logit transformation, given by ln[p_i/(1-p_i)], whose distribution is approximately normal and whose sampling variance is closely approximated by 1/( n_i p_i (1-p_i) ). 

So, let 

y_i = p_i with the corresponding sampling variance v_i = p_i(1-p_i)/n_i

or let

y_i = ln[p_i/(1-p_i)] with the corresponding sampling variance v_i = 1/( n_i p_i (1-p_i) ).
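As a rough illustration, both choices of y_i and v_i can be computed in a few lines of base R. This is only a sketch: it uses the four-study counts quoted in the original message below, and the variable names (events, n, y_raw, and so on) are made up for the example.

```r
# Events and sample sizes for the four studies quoted in the original message
events <- c(83, 90, 129, 70)
n      <- c(86, 93, 136, 82)
p      <- events / n

# Raw proportions with sampling variances p_i(1-p_i)/n_i
y_raw <- p
v_raw <- p * (1 - p) / n

# Logit-transformed proportions with sampling variances 1/(n_i p_i (1-p_i))
y_logit <- log(p / (1 - p))
v_logit <- 1 / (n * p * (1 - p))

round(cbind(p, v_raw, y_logit, v_logit), 4)
```

Note that 1/( n_i p_i (1-p_i) ) is algebraically the same as 1/x_i + 1/(n_i - x_i), where x_i is the number of events, which is the form often seen for the variance of a log odds.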

With y_i and v_i, you can use standard meta-analytic methodology (if the observed proportions are close to 0 or 1, I would use the logit transformed proportions). You can fit the random-effects model, if you want to assume that the variability among the P_i values is entirely random (and normally distributed) and you are interested in making inferences about the expected value of P_i. Or you can try to account for the heterogeneity among the P_i values by examining the influence of moderators. 
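To make the random-effects step concrete, here is a minimal base R sketch that pools the logit-transformed proportions. It assumes the DerSimonian-Laird (method-of-moments) estimator for the between-study variance, which is one common choice and not necessarily what the function mentioned below implements. The data are the four studies from the original message; all object names are illustrative.

```r
# Logit-transformed proportions and sampling variances (four studies below)
events <- c(83, 90, 129, 70)
n      <- c(86, 93, 136, 82)
y <- log(events / (n - events))
v <- 1 / events + 1 / (n - events)

# Fixed-effect (inverse-variance) estimate and the Q statistic
w     <- 1 / v
k     <- length(y)
mu_fe <- sum(w * y) / sum(w)
Q     <- sum(w * (y - mu_fe)^2)

# DerSimonian-Laird estimate of the between-study variance (truncated at 0)
C    <- sum(w) - sum(w^2) / sum(w)
tau2 <- max(0, (Q - (k - 1)) / C)

# Random-effects estimate, standard error, and 95% confidence interval
w_re  <- 1 / (v + tau2)
mu_re <- sum(w_re * y) / sum(w_re)
se_re <- sqrt(1 / sum(w_re))
ci    <- mu_re + c(-1, 1) * qnorm(0.975) * se_re

# Back-transform from the logit scale to the proportion scale
plogis(c(estimate = mu_re, lower = ci[1], upper = ci[2]))
```

A moderator could be added by replacing the weighted mean with a weighted regression of y on the moderator using the weights 1/(v + tau2), although with only four studies this would be for illustration only.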

You might find a function that I have written useful for this purpose. See:

http://www.wvbauer.com/downloads.html

Alternatively, you could fit a logistic regression model with a random intercept to these data (i.e., a generalized linear mixed-effects model). In other words, knowing p_i and n_i for each study, you actually have access to the raw data (consisting of 0's and 1's). This approach is essentially an "individual patient data meta-analysis". Such a model may or may not contain any moderators. You can find a discussion of this approach, for example, in: 

Whitehead (2002). Meta-analysis of controlled clinical trials. Wiley. 
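A sketch of the logistic regression approach, under some assumptions: the 0/1 outcomes are rebuilt from the event counts and sample sizes of the four studies in the original message, and the random-intercept model is fitted with glmer() from the lme4 package, which is my choice for the example and has to be installed separately. The object names are illustrative.

```r
events <- c(83, 90, 129, 70)
n      <- c(86, 93, 136, 82)

# Rebuild the individual-level 0/1 outcomes from the counts:
# each study contributes events[i] ones and n[i] - events[i] zeros
raw <- data.frame(
  study = factor(rep(seq_along(n), times = n)),
  y     = unlist(mapply(function(e, m) rep(c(1, 0), times = c(e, m - e)),
                        events, n, SIMPLIFY = FALSE))
)

# Sanity check: events per study should match the original counts
tapply(raw$y, raw$study, sum)

# Logistic regression with a random intercept for study
# (assumption: lme4 is available; skipped otherwise)
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(y ~ 1 + (1 | study), data = raw, family = binomial)
  print(summary(fit))
}
```

The fixed intercept is the estimated average log odds and the random-intercept variance plays the role of the between-study variance. Moderators would enter as additional fixed effects on the right-hand side of the formula.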

Hope this helps,

-- 
Wolfgang Viechtbauer 
 Department of Methodology and Statistics 
 University of Maastricht, The Netherlands 
 http://www.wvbauer.com/ 

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Inman, Brant A. M.D.
Sent: Tuesday, March 06, 2007 00:56
To: r-help at stat.math.ethz.ch
Cc: Weigand, Stephen D.
Subject: [R] Mixed effects multinomial regression and meta-analysis

R Experts:

I am conducting a meta-analysis where the effect measures to be pooled are simple proportions. For example, consider these data on smokers from Fleiss, Levin, and Paik's Statistical Methods for Rates and Proportions (2003, p. 189):

Study      N    Event  P(Event)
1         86       83     0.965
2         93       90     0.968
3        136      129     0.949
4         82       70     0.854
Total    397      372

A test of heterogeneity for a table like this could simply be Pearson's chi-square test.
------

# Columns: events, non-events; one row per study
smoke.data <- matrix(c(83,90,129,70,3,3,7,12), ncol=2, byrow=FALSE)
chisq.test(smoke.data, correct=TRUE)

> X-squared = 12.6004, df = 3, p-value = 0.005585

------

Now this test implies that the data are heterogeneous and that pooling might be inappropriate. This type of analysis could be considered a fixed effects analysis because it assumes that the 4 studies all come from one underlying population. But what if I wanted to do a mixed effects (fixed + random) analysis of data like this, possibly adjusting for an important covariate or two (assuming I had more studies, of course)? How would I go about doing it? One thought that I had would be to use a mixed effects multinomial logistic regression model, such as that reported by Hedeker (Stat Med 2003, 22: 1433), though I don't know if (or where) it is implemented in R. I am certain there are also other ways...

So, my questions to the R experts are:

1) What method would you use to estimate or account for the between study variance in a dataset like the one above that would also allow you to adjust for a variable that might explain the heterogeneity?

2) Is it implemented in R?

Brant Inman
Mayo Clinic
