[BioC] A question about paired data and how to set up the designmatrix in LIMMA
Robert Gentleman
rgentlem at fhcrc.org
Sun Jul 17 17:36:01 CEST 2005
Hi Johan,
The simple answer is that your data do not quite fit the paired
t-test model. You probably want some form of random effects model as can
be fit by lme (nlme pacakge) or lmer (Matrix package). Where patients
are treated as a random effect and before/after as a fixed effect. I do
not believe that limma fits these models, although it may be extended at
some time. And neither of those pieces of software (lme or lmer) fits an
empirical Bayes model to lots of arrays although they too could be
extended. Hence, there is no easy solution. Nor is fitting and
interpreting mixed effects models simple - if you have little
statistical experience, the best advice is to find someone with a lot
more who can help, as the analysis is not likely to be simple or
straightforward.
You can always take the expedient of discarding some of the data, so
that you are reduced to one before and one after sample per patient.
Then standard paired t-tests can be used. But you have lost data.
Best wishes,
Robert
Johan Lindberg wrote:
> Dear Bioconductor users.
> I posted a question a while ago that did not get an answer. I'm sorry to
> have to nag and ask the same question again but I really need an answer.
> The question that I'm asking is how to do the paired analysis in LIMMA?
> If you don’t have the time to look at my previous post here is a shorter
> version.
>
> I have tried to read previous posts on this matter but I found the
> answers kind of vague (perhaps I missed something due to my limited
> knowledge in statistics, in that case please guide me to an explaining
> post and I apologize for asking the same question again).
>
> I have affydata on 4 patients (paired) before and after treatment
> (unbalanced due to different number of biopsies from each patient). I
> want to fit the patients as a fixed effect. I am interested in the
> effect due to treatment.
>
> In previous posts I have seen Gordon give directions about two ways to
> fit patients as a fixed effect with LIMMA.
>
> http://files.protsuggest.org/biocond/html/8297.html
>
> and
>
> http://files.protsuggest.org/biocond/html/8175.html
>
> My situation look like this:
> #########################################################
> #Fitting the patients as a fixed effect, I am not totally shure of my
> #code, please correct me if I am wrong
> eset<-rma(Data)
> tmp<-pData(eset)
> tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0))
> tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1))
> tmp<-cbind(tmp,Patient=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> pData(eset)<-tmp
> design <- model.matrix(~(After - Before) + Patient,
> data=as.data.frame(tmp))
> fit<-lmFit(eset,design)
> fitModel <- eBayes(fit)
>
> #########################################################
> #Using the block argument to “block” patient.
> eset<-rma(Data)
> tmp<-pData(eset)
> tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0))
> tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1))
> pData(eset) <- tmp
> design <- model.matrix(~(After - Before), data=pData(eset))
> cor<-duplicateCorrelation(eset,
> design,block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> cor$cor
> # [1] 0.2739284
> fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> fitcor <- eBayes(fit)
>
> #I also try this model using the default value, cor=0.75, when using
> lmFit.
> eset<-rma(Data)
> tmp<-pData(eset)
> tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0))
> tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1))
> pData(eset) <- tmp
> design <- model.matrix(~(After - Before), data=pData(eset))
> fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> fitCorDefault <- eBayes(fit)
> fitCorDefault$cor
> #[1] 0.75
> #########################################################
>
> So my questions are, if my code is right, which way is the right way to
> fit patients as a fixed effect since the result differ (the lod-scores
> differ between the different models)? If my code is wrong, please
> correct it. And last, may I be bald enough to suggest (since there seem
> to be a lot of post of how to fit fixed effect in LIMMA) that it would
> be great if there were an example of how to fit fixed effects in the
> LIMMA users guide?
>
> Thank you for your time
>
> Best regards
>
> Johan Lindberg
>
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Johan
> Lindberg
> Sent: Monday, July 11, 2005 11:35 AM
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] A question about paired data and how to set up the
> designmatrix in LIMMA
>
> Hi all.
>
> I have tried to read previous posts on this matter but I found the
> answers kind of vague (perhaps I missed something due to my limited
> knowledge in statistics, in that case please guide me to an explaining
> post and I apologize for asking the same question again).
>
> I have affydata on 4 patients (paired) before and after treatment
> (unbalance due to different number of biopsies from each patient). I
> want to fit the patients as a fixed effect. I am interested in the
> effect due to treatment.
>
> In previous posts I have seen Gordon give directions about two ways to
> fit patients as a fixed effect with LIMMA.
>
> http://files.protsuggest.org/biocond/html/8297.html
>
> and
>
> http://files.protsuggest.org/biocond/html/8175.html
>
> In the first post the t-statistic is according to Gordon suppose to be
> the corresponding paired t-statistic.
> But in the second mail he states “All forms of blocking are much the
> same from a mathematical point of view, whether they are technical
> replicates or blocking on patient.
> The direct extension of a paired t-test would actually arise from
> fitting patient as a fixed effect. If you did that, limma would compute
> for you paired t-tests with moderated denominators. If you use the
> 'block' argument instead of using a fixed effect, I would generally
> recommend that you estimate the common correction rather than take a
> preset value. However I suspect that, in this case, if you use a large
> correlation like 0.75 or higher the results might be very similar to
> using a fixed patient effect.”
>
> My situation look like this:
> #########################################################
> #Fitting the patients as a fixed effect, I am not totally shure of my
> code, please correct me if I am #wrong
> eset<-rma(Data)
> tmp<-pData(eset)
> tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0))
> tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1))
> tmp<-cbind(tmp,Patient=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> pData(eset)<-tmp
> design <- model.matrix(~(After - Before) + Patient,
> data=as.data.frame(tmp))
> fit<-lmFit(eset,design)
> fitModel <- eBayes(fit)
>
> #########################################################
> #Using the block argument to “block” patient.
> eset<-rma(Data)
> tmp<-pData(eset)
> tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0))
> tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1))
> pData(eset) <- tmp
> design <- model.matrix(~(After - Before), data=pData(eset))
> cor<-duplicateCorrelation(eset,
> design,block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> cor$cor
> # [1] 0.2739284
> fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> fitcor <- eBayes(fit)
>
> # I also try this model using the default value, cor=0.75, when using
> lmFit.
> eset<-rma(Data)
> tmp<-pData(eset)
> tmp<-cbind(tmp,Before=c(1,0,1,1,1,0,1,0,1,0,1,0,1,0))
> tmp<-cbind(tmp,After=c(0,1,0,0,0,1,0,1,0,1,0,1,0,1))
> pData(eset) <- tmp
> design <- model.matrix(~(After - Before), data=pData(eset))
> fit<-lmFit(eset, design, block=c(1,1,1,1,1,1,2,2,3,3,3,3,4,4))
> fitCorDefault <- eBayes(fit)
> fitCorDefault$cor
> [1] 0.75
> #########################################################
>
> First of all, is my code right? I am kind of sceptic since Gordon states
> that “if you use a large correlation like 0.75 or higher the results
> might be very similar to using a fixed patient effect”. When I plot the
> B-scores of both models they differ somewhat in the ranking and in the
> number of genes with B-score >0. They look kind of the same but the
> B-scores is higher when I use the block argument to “block” patient
> using the default correlation.
>
> colnames(fitModel$lods)
> #[1] "(Intercept)" "After" "Patient"
> length(which(fitModel$lods[,2] > 0))
> #[1] 0
>
> fitcor$cor
> #[1] 0.2739284
> colnames(fitcor$lods)
> #[1] "(Intercept)" "After"
> length(which(fitcor$lods[,2] > 0))
> #[1] 4
>
> fitCorDefault$cor
> #[1] 0.75
> colnames(fitCorDefault$lods)
> #[1] "(Intercept)" "After"
> length(which(fitCorDefault$lods[,2] > 0))
> #[1] 109
>
> #################
> #To get a feeling of similarity between the ranking of the genes
> cor(fitcor$lods[,2], fitCorDefault$lods[,2],
> use="pairwise.complete.obs")
> [1] 0.9781474
> cor(fitcor$lods[,2], fitModel$lods[,2], use="pairwise.complete.obs")
> [1] 0.9753385
> cor(fitCorDefault$lods[,2], fitModel$lods[,2],
> use="pairwise.complete.obs")
> [1] 0.9610673
> ################
>
> So my questions are, if my code is right, which way is the right way to
> fit patients as a fixed effect since the result differ?
>
> If my code is wrong, please guide me of how to get it right.
>
> Thank you for your time
>
> Best regards
>
> Johan Lindberg
>
>
> ************************************************************************
> *******************
> Johan Lindberg
> Royal Institute of Technology
> AlbaNova University Center
> Stockholm Center for Physics, Astronomy and Biotechnology
> Department of Molecular Biotechnology
> 106 91 Stockholm, Sweden
>
> Phone (office) +46 8 553 783 44
> Fax + 46 8 553 784 81
> Visiting adress Roslagstullsbacken 21, Floor 3
> Delivery adress Roslagsvägen 30B
> http://www.biotech.kth.se/molbio/microarray/index.html
> ************************************************************************
> *******************
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
More information about the Bioconductor
mailing list