[BioC] Limma, model with several factors

Zadeh, Jenny Drnevich drnevich at illinois.edu
Thu Aug 8 20:28:00 CEST 2013


Hi Gordon,

I'm confused by your response to Ingrid last May as to how to construct a design matrix for her 2 factor experiment with repeated measures on 1 factor. I came across this post in searching the mailing list because I have a similar experimental design, but I was getting the dreaded "Coefficients not estimable" warning message with what I thought was the same design matrix you suggested to Ingrid. Your example below actually produces a singular design matrix:

#reproducible example:
> Subject <- gl(4,2)
> treatment <- gl(2,2,8)
> timepoint <- gl(2,1,8)
> Treat.Time <- factor(paste(treatment,timepoint,sep="."))
> design <- model.matrix(~0+Treat.Time+Subject)
> colnames(design)[1:4] <- levels(Treat.Time)
> design
  1.1 1.2 2.1 2.2 Subject2 Subject3 Subject4
1   1   0   0   0        0        0        0
2   0   1   0   0        0        0        0
3   0   0   1   0        1        0        0
4   0   0   0   1        1        0        0
5   1   0   0   0        0        1        0
6   0   1   0   0        0        1        0
7   0   0   1   0        0        0        1
8   0   0   0   1        0        0        1
attr(,"assign")
[1] 1 1 1 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$Treat.Time
[1] "contr.treatment"

attr(,"contrasts")$Subject
[1] "contr.treatment"

> 
> is.fullrank(design)
[1] FALSE
> 
> nonEstimable(design)
[1] "Subject4"

My searches of the mailing list led to a previous question also about the same experimental design, in which your response seems to indicate that you can't treat Subject as a fixed effect:
https://stat.ethz.ch/pipermail/bioconductor/2010-March/032523.html

Is this really true? I found this page (http://ww2.coastal.edu/kingw/statistics/R-tutorials/repeated.html) about modeling repeated measures factors in R, and it seems you could model a single gene like this:

#not-reproducible
> aov.out = aov(expression.level ~ treatment * timepoint + Error(Subject/timepoint), data=one.gene)

But this model formulation doesn't work with model.matrix(). So I'm confused about the best way to model my experiment: I have 11 subjects divided into 4 treatments with 3 timepoint measurements per subject and we're interested in treatment effects, time effects and interaction effects. I'm worried about simply treating Subject as a random effect because when I look separately within each treatment, only 1 of the 4 treatments actually has any substantial Subject effect! (Subject correlation from duplicateCorrelation() for the entire experiment is 0.038, but within each treatment the values are -0.036, -0.042, -0.041 and 0.110). If none of them had an observable Subject effect, I could just ignore it and do a standard 2-factor ANOVA design, but because only one treatment has an effect I'm worried the overall correlation value over-estimates the Subject effect in 3 treatments and under-estimates in the 1 treatment. Can I possibly drop some of the Subject coefficients from model.matrix(~0+Treat.Time+Subject) and only keep those pertaining to the treatment that does show Subject effects?

I would greatly appreciate any advice or suggestions!
Jenny

-----Original Message-----
From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces at r-project.org] On Behalf Of Gordon K Smyth
Sent: Monday, May 20, 2013 7:29 PM
To: Ingrid Dahlman [guest]
Cc: Bioconductor mailing list
Subject: [BioC] Limma, model with several factors

Dear Ingrid,

If you block on subject (as in Section 8.4 of the User's Guide), then you have automatically blocked on center as well, because the subjects are different in the two centers.

You experiment can be analyzed as suggested in Section 8.4.2 of the User's Guide.

First combine treatment and timepoint into one factor (following Section
8.5.2):

   Treat.Time <- factor(paste(treatment,timepoint,sep="."))

Then make a design matrix including the blocking factor:

   design <- model.matrix(~0+Treat.Time+Subject)
   colnames(design)[1:4] <- levels(Treat.Time)
   cont.matrix <- makeContrasts((F.A-F.B)-(P.A-P.B),levels=design)

etc.

The rule is that the experimental factors of interest are combined into the treatment factors, whereas the nuisance factors are blocked on.

Best wishes
Gordon

> Date: Mon, 20 May 2013 01:01:08 -0700 (PDT)
> From: "Ingrid Dahlman [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, ingrid.dahlman at ki.se
> Subject: [BioC] Limma, model with several factors
>
>
> I have carefully read the Limma users guide but still have not sorted 
> out how I design the contrasts for the following Project.
>
> In want to compare the effect (After vs before) of two different 
> treatments, F and P. The study was carried our in two different centers.
> This can be illustrated as follows:

> Subject center treatment timepoint
> 1            1         F          B
> 1            1         F          A
> 2            1         P          B
> 2            1         P          A
> 3            2         F          B
> 3            2         F          A
> 4            2         P          B
> 4            2         P          A

> I want to compare (F.A-F.B)-(P.A-P.B), blocking for subject. However, 
> in addition, I would like to block for center. I.e. the center is like 
> a batch effect.

> Is it possible to block for two factors, suject and center, in the 
> same test in Limma?

> /Ingrid
>
> -- output of sessionInfo():
>
> See section 8.7 in the user guide.
>
> --
> Sent via the guest posting facility at bioconductor.org.
>

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:9}}



More information about the Bioconductor mailing list