[R] Meta-analysis on a repeated measures design with multiple trials per subject using metafor

Marc Heerdink m.w.heerdink at uva.nl
Fri Jul 5 12:15:55 CEST 2013

Dear Wolfgang and other readers of the r-help list,

Thank you very much for your suggestion. Unfortunately, the data that I 
have cannot be described with a table such as the one you have made, 
because no trial was presented under both treatment 1 and treatment 
2. To clarify, let me describe the experiments in more detail:

* All subjects were presented with the same number of trials
* Half of these trials were preceded by a prime from category 1 
(treatment 1) and the other half by a prime from category 2 
(treatment 2)
* Subjects were asked to respond to these trials (a unique stimulus for 
each trial) by pressing one of two keys on the keyboard.

Because everything was randomized, I can only calculate the total number 
of times a certain response was used under each type of trial. There is 
no pairing of trials under two treatments, so I am forced to use the 
marginal totals from your table.

I have uploaded a simplified version of the data for one experiment to 
illustrate this (the actual experiments have five treatments and some 
have moderators):

This is the script that I used to generate the data:

The problem thus appears to lie mainly in estimating the variance of the 
proportion difference from only the marginal totals, is that correct? Is 
there a way to calculate it from only the marginal totals?
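
A minimal sketch of one candidate calculation, offered only as an illustration. It ASSUMES that, within a subject, the responses under the two treatments can be treated as two independent binomial samples; nothing in this thread confirms that assumption for these experiments. The function name prop_diff_indep is hypothetical, and the counts are the ones from the example table (response 1 given on 10 of 30 trials under treatment 1 and on 20 of 30 under treatment 2).

```r
# Proportion difference and its sampling variance from marginal totals only.
# ASSUMPTION: the two treatments yield independent binomial samples within a
# subject. This is an illustrative assumption, not a result from the thread.
prop_diff_indep <- function(x1, n1, x2, n2) {
  p1 <- x1 / n1                      # proportion of response 1, treatment 1
  p2 <- x2 / n2                      # proportion of response 1, treatment 2
  yi <- p2 - p1                      # proportion difference
  vi <- p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2  # variance under independence
  list(yi = yi, vi = vi)
}

# Marginal totals from the example table
res <- prop_diff_indep(x1 = 10, n1 = 30, x2 = 20, n2 = 30)
print(res)
```

If the independence assumption does not hold, this variance would be off by the covariance term that the paired table is needed for.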

One alternative that I have tried over the last few days is to use the 
b parameter of interest and its corresponding standard error from the 
lme4 regression output that I use to analyse the individual experiments. 
Then, I use rma(yi, sei = sei) to do a meta-analysis on these parameters. 
I am not sure this is correct though, since each estimate already takes 
into account between-subjects variance (through a random effect for 
subject), and it is sensitive to the covariates/moderators I include in 
the models that I get the b parameters from.
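
A rough sketch of what this alternative amounts to. The data frame est and all numbers in it are made up for illustration; they are not results from the experiments. The hand calculation is plain inverse-variance (fixed-effect) pooling, and the metafor call is run only if the package is installed. Note that the standard errors are passed by name (sei =), because the second positional argument of rma() is the sampling variance vi.

```r
# HYPOTHETICAL per-experiment estimates: the coefficient of interest (b) and
# its standard error (se). The values are invented for illustration only.
est <- data.frame(
  experiment = 1:5,
  b  = c(0.42, 0.31, 0.55, 0.12, 0.38),
  se = c(0.15, 0.18, 0.20, 0.16, 0.14)
)

# Inverse-variance (fixed-effect) pooling by hand, in base R
w         <- 1 / est$se^2
pooled_b  <- sum(w * est$b) / sum(w)
pooled_se <- sqrt(1 / sum(w))
print(c(pooled_b = pooled_b, pooled_se = pooled_se))

# The same estimates passed to metafor, if available. method = "FE" requests
# the fixed-effect model that corresponds to the hand calculation above;
# leaving method at its default fits a random-effects model instead.
if (requireNamespace("metafor", quietly = TRUE)) {
  fit <- metafor::rma(yi = b, sei = se, data = est, method = "FE")
  print(fit)
}
```

Whether such coefficients are comparable across experiments still depends on the models they come from, which is the concern raised above.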

Thanks again for your help, and for any suggestions for solving this 


On 07/04/2013 11:21 PM, Viechtbauer Wolfgang (STAT) wrote:
> Dear Marc,
> Let me see if I understand the type of data you have. You say that you have 5 experiments. And within each experiment, you have n subjects and for each subject, you have data in the form described in your post. Now for each subject, you want to calculate some kind of measure that quantifies how much more likely it was that subjects gave/chose response 2 under treatment 2 versus treatment 1. So, you would have n such values. And then you want to pool those values over the n subjects within a particular experiment and then ultimately over the 5 experiments. Is that correct so far?
> Assuming I got this right, let me ask you about those data that you have for each subject. In particular, are these paired data? In other words, is there a 1:1 relationship between the 30 trials under treatment 1 versus treatment 2? Or phrased yet another way, can you construct a table like this for every subject:
>                  trt 2
>               ------------
>               resp1 resp2
> trt 1 resp1  a     b      10
>        resp2  c     d      20
>               20    10     30
> Note that I added the marginal counts based on your example data, but this is not sufficient to reconstruct how often response 1 was chosen for the same trial under both treatment 1 and treatment 2 (cell "a"). And so on for the other 3 cells.
> If all of this applies, then essentially you are dealing with dependent proportions and you can calculate the difference y = (20/30)-(10/30) as you have done. The corresponding sampling variance can be estimated with v = var(y) = (a+b)*(c+d)/t^3 + (a+c)*(b+d)/t^3 - 2*(a*d/t^3 - b*c/t^3) (where t is the number of trials, i.e., 30 in the example above). See, for example, section 10.1.1. in Agresti (2002) (Categorical data analysis, 2nd ed.).
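
The formula quoted above can be sketched in R as follows. The function name dep_prop_diff is hypothetical, and the cell counts are INVENTED so that they reproduce the margins of the example table (10/20 and 20/10); the real cells cannot be recovered from those margins.

```r
# Difference between two dependent (paired) proportions and its sampling
# variance, using the formula quoted above. a, b, c, d are the cells of the
# paired 2x2 table; their sum is the number of trials.
dep_prop_diff <- function(a, b, c, d) {
  trials <- a + b + c + d
  # column margin (a + c) minus row margin (a + b), as proportions
  yi <- (a + c) / trials - (a + b) / trials
  vi <- (a + b) * (c + d) / trials^3 + (a + c) * (b + d) / trials^3 -
    2 * (a * d / trials^3 - b * c / trials^3)
  list(yi = yi, vi = vi)
}

# HYPOTHETICAL cells: row margins 10 and 20, column margins 20 and 10
paired <- dep_prop_diff(a = 8, b = 2, c = 12, d = 8)
print(paired)
```

When a*d equals b*c, the last term vanishes and the variance reduces to the sum of the two marginal binomial variances.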
> So, ultimately, you will have n values of y and v for a particular experiment and then the same thing for all 5 experiments. You can then pool those values with rma(yi, vi) in metafor (yi and vi being the vectors of the y and v values). You probably want to add a factor to the model that indicates which experiment those values came from. So, something like: rma(yi, vi, mods = ~ factor(experiment)).
> Well, I hope that I understood your data correctly.
> Best,
> Wolfgang
> --
> Wolfgang Viechtbauer, Ph.D., Statistician
> Department of Psychiatry and Psychology
> School for Mental Health and Neuroscience
> Faculty of Health, Medicine, and Life Sciences
> Maastricht University, P.O. Box 616 (VIJV1)
> 6200 MD Maastricht, The Netherlands
> +31 (43) 388-4170 | http://www.wvbauer.com
> ________________________________________
> From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Marc Heerdink [m.w.heerdink at uva.nl]
> Sent: Wednesday, July 03, 2013 2:15 PM
> To: r-help at r-project.org
> Subject: [R] Meta-analysis on a repeated measures design with multiple trials per subject using metafor
> Hi all,
> I am currently attempting to compile a summary of a series of five
> psychological experiments, and I am trying to do this using the metafor
> package. However, I am quite unsure which of the scenarios described in
> the metafor help pages applies to these data, because it is a repeated
> measures design, with multiple trials in each condition.
> Assume that for every participant, I have a basic contingency table such
> as this one:
>                  treatment
>                  1       2
> response
> 1               10      20
> 2               20      10
> (if this ASCII version does not work, I have 30 trials in each
> treatment, and participants give either response 1 or 2; the exact
> numbers don't matter)
> The problem that I am trying to solve is how to convert these numbers to
> an effect size estimate that I can use with metafor.
> As far as I understand it, I can only use it to get an effect size for
> outcomes that are dichotomous; i.e., either 1 or 0 for any subject.
> However, I have proportion data for every participant.
> I have considered and tried these strategies:
> 1. Base the effect size on within-participant proportion differences.
> That is, in the table above, the treatment effect would be
> (20/30)-(10/30) = 1/3; and I would take the M and SD of these values to
> estimate a study-level effect ("MN" measure in metafor).
> 2. Use the overall treatment * response contingency table, ignoring the
> fact that these counts come from different participants ("PHI" or "OR"
> measures in metafor). In a study with 10 participants, I would get cell
> counts around 150.
> However, from the research I've done into this topic, I know that 1)
> cannot (as far as I understand) be used for an odds ratio, and I suspect
> that 2) overestimates the effect.
> A third method would be to use the regression coefficients, which I can
> easily obtain since I have all the raw data that I need. However, it is
> unclear to me whether, and if so how, I can use these in the metafor
> package.
> From another message about this topic that I found on this list (1), I
> understand that having access to the raw data is an advantage, but I am
> not sure whether the scenario mentioned there applies to my situation.
> 1:
> http://r.789695.n4.nabble.com/meta-analysis-with-repeated-measure-designs-td2252644.html
> I would very much appreciate any suggestions or hints on this topic.
> Regards,
> Marc

Marc Heerdink, MSc. (PhD. candidate)
Dept. of Social Psychology
University of Amsterdam
