[BioC] Yet another nested design in limma

Gordon K Smyth smyth at wehi.EDU.AU
Wed May 6 02:46:03 CEST 2009


Just a further note.  In my previous code, Batch and Line were not 
converted to factors.  This is ok here, because Batch has 2 levels and 
Line has only 2 levels within Phenotype.  In general though, you will find 
a design like this very much safer to analyse correctly if you re-number 
the nested factor 1:2 within each Phenotype.  Therefore,

   targets <- read.delim("targets.txt")
   Line <- 2 - (targets$Line %% 2)
   Line <- factor(Line)
   Batch <- factor(targets$Batch)
   Sex <- factor(targets$Sex)
   Phen <- factor(targets$Phen)
   design <- model.matrix(~Batch+Sex*(Phen/Line), data=targets)

   > colnames(design)
    [1] "(Intercept)"      "Batch2"           "SexM"
    [4] "PhenH"            "PhenL"            "PhenA:Line2"
    [7] "PhenH:Line2"      "PhenL:Line2"      "SexM:PhenH"
   [10] "SexM:PhenL"       "SexM:PhenA:Line2" "SexM:PhenH:Line2"
   [13] "SexM:PhenL:Line2"

etc

Best wishes
Gordon

On Wed, 6 May 2009, Gordon K Smyth wrote:

> Dear Paolo,
>
> As Naomi Altman as already told you, analysing an experiment such as this is 
> straightforward with limma.  I guess the problem you are having is that you 
> are trying to use the limma User's Guide's suggestion of forming a composite 
> factor out of the individual factors (called the group means 
> parametrization), and you don't know how to define contrasts for interactions 
> from this factor.  This does become a little more involved for experiments 
> with more factors.  Can I suggest that you instead make use of the factorial 
> formulae in R when you make up the design matrix, then you can probably 
> dispense with the contrast step altogether.
>
> You could for example use
>
>  targets <- read.delim("targets.txt")
>  design <- model.matrix(~Batch+Sex*(Phen/Line), data=targets)
>
> This will produce a design matrix with the following columns.
>
>  > colnames(design)
> [1] "(Intercept)"     "Batch"           "SexM"
> [4] "PhenH"           "PhenL"           "PhenA:Line"
> [7] "PhenH:Line"      "PhenL:Line"      "SexM:PhenH"
> [10] "SexM:PhenL"      "SexM:PhenA:Line" "SexM:PhenH:Line"
> [13] "SexM:PhenL:Line"
>
> To find genes significant for the sex x line interaction, you can simply use
>
>  fit <- lmFit(eset, design)
>  fit <- eBayes(fit)
>  topTable(fit, coef=9:13)
>
> On the other hand,
>
>  topTable(fit, coef=9:10)
>
> is the sex x phen interaction.
>
> Finally, you can add the biolrep as a random effect using the 
> duplicateCorrelation() function with block argument, as explained in the 
> limma User's Guide, but I am not convinced yet that this is absolutely 
> necessary for your experiment.
>
> Can I also suggest that you stroll over to the mathematics department at 
> Uppsala and talk to someone interested in bioinformatics and microarray 
> analysis, say Professor Tom Britton, and see if you can get ongoing help with 
> statistics and design issues.
>
> Best wishes
> Gordon
>
>
>> Date: Mon, 04 May 2009 14:09:14 +0200
>> From: Paolo Innocenti <paolo.innocenti at ebc.uu.se>
>> Subject: Re: [BioC] Yet another nested design in limma
>> Cc: AAA - Bioconductor <bioconductor at stat.math.ethz.ch>
>> 
>> Hi all,
>> 
>> since I received a few emails in my mailbox by people interested in a
>> solution for this design (or a design similar to this one), but there is
>> apparently no (easy) solution in limma, I was wondering if anyone could
>> suggest a package for differential expression analysis that allows
>> dealing with:
>> 
>> nested designs,
>> random effects,
>> multiple factorial designs with more than 2 levels.
>> 
>> I identified siggenes, maanova, factDesign that could fit my needs, but
>> I would like to have a comment by someone with more experience before
>> diving into a new package.
>> 
>> Best,
>> paolo
>> 
>> 
>> 
>> Paolo Innocenti wrote:
>>> Hi Naomi and list,
>>> 
>>> some time ago I asked a question on how to model an experiment in limma.
>>> I think I need some additional help with it as the experiment grew in
>>> complexity. I also added a factor "batch" because the arrays were run in
>>> separate batches, and I think would be good to control for it.
>>> The dataframe with phenotypic informations ("dummy") looks like this:
>>> 
>>> >>         Phen    Line    Sex     Batch     BiolRep
>>> >> File1   H       1       M       1         1
>>> >> File2   H       1       M       1         2
>>> >> File3   H       1       M       2         3
>>> >> File4   H       1       M       2         4
>>> >> File5   H       1       F       1         1
>>> >> File6   H       1       F       1         2
>>> >> File7   H       1       F       2         3
>>> >> File8   H       1       F       2         4
>>> >> File9   H       2       M       1         1
>>> >> File10  H       2       M       1         2
>>> >> File11  H       2       M       2         3
>>> >> File12  H       2       M       2         4
>>> >> File13  H       2       F       1         1
>>> >> File14  H       2       F       1         2
>>> >> File15  H       2       F       2         3
>>> >> File16  H       2       F       2         4
>>> >> File17  L       3       M       1         1
>>> >> File18  L       3       M       1         2
>>> >> File19  L       3       M       2         3
>>> >> File20  L       3       M       2         4
>>> >> File21  L       3       F       1         1
>>> >> File22  L       3       F       1         2
>>> >> File23  L       3       F       2         3
>>> >> File24  L       3       F       2         4
>>> >> File25  L       4       M       1         1
>>> >> File26  L       4       M       1         2
>>> >> File27  L       4       M       2         3
>>> >> File28  L       4       M       2         4
>>> >> File29  L       4       F       1         1
>>> >> File30  L       4       F       1         2
>>> >> File31  L       4       F       2         3
>>> >> File32  L       4       F       2         4
>>> >> File33  A       5       M       1         1
>>> >> File34  A       5       M       1         2
>>> >> File35  A       5       M       2         3
>>> >> File36  A       5       M       2         4
>>> >> File37  A       5       F       1         1
>>> >> File38  A       5       F       1         2
>>> >> File39  A       5       F       2         3
>>> >> File40  A       5       F       2         4
>>> >> File41  A       6       M       1         1
>>> >> File42  A       6       M       1         2
>>> >> File43  A       6       M       2         3
>>> >> File44  A       6       M       2         4
>>> >> File45  A       6       F       1         1
>>> >> File46  A       6       F       1         2
>>> >> File47  A       6       F       2         3
>>> >> File48  A       6       F       2         4
>>> 
>>> In total I have
>>> Factor "Phen", with 3 levels
>>> Factor "Line", nested in Phen, 6 levels
>>> Factor "Sex", 2 levels
>>> Factor "Batch", 2 levels
>>> 
>>> I am interested in:
>>> 
>>> 1) Effect of sex (M vs F)
>>> 2) Interaction between "Sex" and "Line" (or "Sex" and "Phen")
>>> 
>>> Now, I can't really come up with a design matrix (not to mention the
>>> contrast matrix).
>>> 
>>> Naomi Altman wrote:
>>>> You can design this in limma quite readily.  Nesting really just means
>>>> that only a subset of the possible contrasts are of interest.  Just
>>>> create the appropriate contrast matrix and you are all set.
>>> 
>>> I am not really sure with what you mean here. Should I treat all the
>>> factors as in a factorial design?
>>> I might do something like this:
>>> 
>>> phen <- factor(dummy$Phen)
>>> line <- factor(dummy$Line)
>>> sex <- factor(dummy$Sex)
>>> batch <- factor(dummy$Batch)
>>> fact <- factor(paste(sex,phen,line,sep="."))
>>> design <- model.matrix(~ 0 + fact + batch)
>>> colnames(design) <- c(levels(fact), "batch2")
>>> fit <- lmFit(dummy.eset,design)
>>> contrast <- makeContrasts(
>>>         sex= (F.H.1 + F.H.2 + F.L.3 + F.L.4 + F.A.5 + F.A.6) - (M.H.1 +
>>> M.H.2 + M.L.3 + M.L.4 + M.A.5 + M.A.6),
>>>         levels=design)
>>> fit2 <- contrasts.fit(fit,contrast)
>>> fit2 <- eBayes(fit2)
>>> 
>>> In this way I can correctly (I presume) obtain the effect of sex, but
>>> how can I get the interaction term between sex and line?
>>> I presume there is a "easy" way, but I can't see it...
>>> 
>>> Thanks,
>>> paolo
>>> 
>>> 
>>>> 
>>>> --Naomi
>>>> 
>>>> At 12:08 PM 2/16/2009, Paolo Innocenti wrote:
>>>>> Hi all,
>>>>> 
>>>>> I have an experimental design for a Affy experiment that looks like
>>>>> this:
>>>>>
>>>>>         Phen    Line    Sex     Biol.Rep.
>>>>> File1   H       1       M       1
>>>>> File2   H       1       M       2
>>>>> File3   H       1       F       1
>>>>> File4   H       1       F       2
>>>>> File5   H       2       M       1
>>>>> File6   H       2       M       2
>>>>> File7   H       2       F       1
>>>>> File8   H       2       F       2
>>>>> File9   L       3       M       1
>>>>> File10  L       3       M       2
>>>>> File11  L       3       F       1
>>>>> File12  L       3       F       2
>>>>> File13  L       4       M       1
>>>>> File14  L       4       M       2
>>>>> File15  L       4       F       1
>>>>> File16  L       4       F       2
>>>>> 
>>>>> 
>>>>> This appears to be a slightly more complicated situation than the one
>>>>> proposed in the section 8.7 of the limma users guide (p.45) or by
>>>>> Jenny on this post:
>>>>> 
>>>>> https://stat.ethz.ch/pipermail/bioconductor/2006-February/011965.html
>>>>> 
>>>>> In particular, I am intersted in
>>>>> - Effect of "sex" (M vs F)
>>>>> - Interaction between "sex" and "phenotype ("line" nested)
>>>>> - Effect of "phenotype" in males
>>>>> - Effect of "phenotype" in females
>>>>> 
>>>>> Line should be nested in phenotype, because they are random "strains"
>>>>> that happened to end up in phenotype H or L.
>>>>> 
>>>>> Can I design this in limma? Is there a source of information about
>>>>> how to handle with this? In particular, can I design a single model
>>>>> matrix and then choose the contrasts I am interested in?
>>>>> 
>>>>> Any help is much appreciated,
>>>>> paolo
>>>>> 
>>>>> 
>>>>> --
>>>>> Paolo Innocenti
>>>>> Department of Animal Ecology, EBC
>>>>> Uppsala University
>>>>> Norbyv?gen 18D
>>>>> 75236 Uppsala, Sweden
>>>>> 
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> 
>>>> Naomi S. Altman                                814-865-3791 (voice)
>>>> Associate Professor
>>>> Dept. of Statistics                              814-863-7114 (fax)
>>>> Penn State University                         814-865-1348 (Statistics)
>>>> University Park, PA 16802-2111
>>>> 
>>>> 
>>> 
>> 
>> -- 
>> Paolo Innocenti
>> Department of Animal Ecology, EBC
>> Uppsala University
>> Norbyv?gen 18D
>> 75236 Uppsala, Sweden
>



More information about the Bioconductor mailing list