[BioC] A question about Limma

Gordon K Smyth smyth at wehi.EDU.AU
Thu Jan 6 23:35:06 CET 2005


I agree.  In my reply to Fangxin I should have added that I would remove a
non-essential effect, such as a dye effect, if it appeared non-significant,
but I'd remove it for all the genes.

Gordon

On Tue, January 4, 2005 1:18 am, Naomi Altman said:
> Reducing the model based on removing nonsignificant effects is called
> "pre-test estimation".  It is known to increase the false-positive rate,
> even in the classical setting.  In the microarray setting, there is no
> compelling reason to use pre-test estimators that differ from gene to gene.
>
> --Naomi Altman
>
> At 10:57 PM 1/3/2005 +1100, Gordon K Smyth wrote:
>> > Date: Sun, 2 Jan 2005 14:05:15 -0800 (PST)
>> > From: "Fangxin Hong" <fhong at salk.edu>
>> > Subject: [BioC] A question about Limma
>> > To: bioconductor at stat.math.ethz.ch
>> > Message-ID: <1867.66.75.240.64.1104703515.squirrel at 66.75.240.64>
>> > Content-Type: text/plain;charset=iso-8859-1
>> >
>> > Hi Bioconductor users;
>> > I have a general question about the limma model.
>> > In the limma package, usually one linear model is applied to all genes, and
>> > the error variances from all genes are modified simultaneously. What if some
>> > factor, for example one main effect, is significant for only some genes,
>> > and we want to identify genes based on the significance of another main
>> > effect (of interest)? What is the best way to do it? Currently I just
>> > leave this factor in the model, which is applied to all genes,
>>
>>That's what I do, leave all terms in the models for all the genes.  I
>>don't see a strong case for doing a separate model selection process for
>>every gene.
>>
>> > but this
>> > might under-estimate the total number of genes on which the effect of
>> > interest is significant.
>>
>>Why do you think so?  The only disadvantage of keeping a non-significant
>>term in the model is a reduction in residual degrees of freedom, with some
>>consequential loss of power, but this disadvantage is mitigated by the
>>empirical Bayes moderation process.
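
To make the degrees-of-freedom point concrete, here is a small arithmetic
sketch in Python (it is not limma code). It assumes the usual empirical Bayes
form in which each gene's residual variance is shrunk toward a prior variance,
and the prior contributes df_prior extra degrees of freedom to the moderated
t-test. The function names and all numbers (8 arrays, a prior worth 4 df, the
variances) are made up for illustration. The gene's residual variance is held
fixed across the two models, which is a simplification.

```python
def moderated_variance(s2, df_resid, s2_prior, df_prior):
    """Shrink a gene-wise residual variance toward a prior variance.

    Weighted average of the prior variance and the gene's own variance,
    with the respective degrees of freedom as weights.
    """
    return (df_prior * s2_prior + df_resid * s2) / (df_prior + df_resid)


def total_df(df_resid, df_prior):
    """Degrees of freedom of the moderated t-statistic."""
    return df_prior + df_resid


# Hypothetical experiment: 8 arrays, prior information worth 4 df.
n_arrays = 8
df_prior = 4.0
s2_prior = 0.05   # assumed prior variance
s2_gene = 0.09    # assumed residual variance for one gene

# Compare a model without the extra term (2 coefficients)
# against one that keeps it (3 coefficients).
for n_coef in (2, 3):
    df_resid = n_arrays - n_coef
    s2_mod = moderated_variance(s2_gene, df_resid, s2_prior, df_prior)
    print(
        f"coefficients={n_coef}  residual df={df_resid}  "
        f"total df={total_df(df_resid, df_prior):.0f}  "
        f"moderated variance={s2_mod:.4f}"
    )
```

Under these assumed numbers, keeping the extra term costs one degree of
freedom out of 10 for the moderated test, rather than one out of 6 for the
ordinary gene-wise test, so the relative loss is smaller.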
>>
>>Perhaps someday someone will work out a model selection theory for
>>massively parallel regression situations like microarray experiments, but
>>there isn't such a theory now.  It seems safer to me to have the same
>>model for every gene, keeping all the 'a priori' important predictors in
>>the model.
>>
>>Gordon
>>
>> > I am sorry if this question has been asked/answered here before; I
>> > couldn't find it by searching the archive. Any comment, suggestion or
>> > experience is appreciated.
>> >
>> > Fangxin
>> > --
>> > Fangxin Hong, Ph.D.
>> > Plant Biology Laboratory
>> > The Salk Institute
>> > 10010 N. Torrey Pines Rd.
>> > La Jolla, CA 92037
>> > E-mail: fhong at salk.edu
>>
>
> Naomi S. Altman                                814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics                              814-863-7114 (fax)
> Penn State University                         814-865-1348 (Statistics)
> University Park, PA 16802-2111
>


