[BioC] simultaneous use of robust and weighting methods in limma

Gordon K Smyth smyth at wehi.EDU.AU
Fri Dec 13 05:08:22 CET 2013


Hi Richard,

In principle, they can be used in any combination, but how well the 
combinations perform still awaits careful testing.  I would personally be 
reluctant to use lmFit(method="robust") with the other methods just because 
I don't trust the variance estimators from the MM regression that much.

lmFit(method="robust") is designed to deal with individual expression 
values as outliers.  arrayWeights() is designed to deal with outlier 
arrays.  eBayes(robust=TRUE) is designed to deal with outlier 
(hypervariable) genes.  So the first is observation based, the second is 
array based, and the third is gene based.  Rather than trying all 
combinations, I would be guided by the scientific context and what type of 
aberration seems of high risk.  Outlier arrays typically arise when RNA 
samples vary markedly in quality, and this is common in human clinical 
studies when RNA is hard to get.  Outlier genes typically arise when a 
minority of genes are affected by a hidden covariate or batch effect.
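
As a rough sketch of where each of the three options enters a limma 
analysis, the following uses simulated data. The matrix y, the two-group 
design, and all object names are assumptions made for illustration only, 
not part of the original question.

```r
library(limma)

# Simulated log-expression values: 200 genes on 10 arrays (illustration only)
set.seed(1)
y <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)
group <- factor(rep(c("A", "B"), each = 5))
design <- model.matrix(~ group)

# Observation based: robust regression downweights individual outlying values
fit.obs <- lmFit(y, design, method = "robust")

# Array based: estimate one relative quality weight per array,
# then pass the weights to an ordinary least squares fit
aw <- arrayWeights(y, design)
fit.arr <- lmFit(y, design, weights = aw)

# Gene based: robust empirical Bayes limits the influence of
# hypervariable genes on the estimated prior
fit.gene <- eBayes(lmFit(y, design), robust = TRUE)
```

Each call is shown on its own here, which fits the advice to pick the option 
that matches the aberration considered most likely rather than to apply all 
three by default.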

lmFit(method="robust") has been in limma since the earliest days, but it 
hasn't been used so much in practice.  This may be because microarrays 
have a limited dynamic range and so don't tend to show dramatic 
single-observation outliers.  (RNA-seq may prove to be different.)  Or it 
might be because the least squares approach on the log-scale is pretty 
robust anyway.

Most people might be familiar with robust methods as a way to add 
protection against outliers, but array or gene outliers tend to produce 
conservative results in the limma pipeline anyway.  The major purpose of 
arrayWeights and eBayes(robust=TRUE) is to recover statistical power in 
the presence of poor data, without having to make ad hoc judgements about 
which poorer quality arrays or probes to remove from the analysis.
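
A minimal sketch of that idea, assuming simulated data in which one array 
is deliberately noisier than the rest: the array is kept in the analysis 
and downweighted rather than removed. The data, the choice of array 8 as 
the poor one, and the object names are all hypothetical.

```r
library(limma)

set.seed(2)
ngenes <- 500
group <- factor(rep(c("A", "B"), each = 4))
design <- model.matrix(~ group)

# Simulated log-expression values with 50 genes shifted in group B
y <- matrix(rnorm(ngenes * 8), nrow = ngenes, ncol = 8)
y[1:50, 5:8] <- y[1:50, 5:8] + 2

# Make array 8 poor quality by adding extra noise to every gene
y[, 8] <- y[, 8] + rnorm(ngenes, sd = 3)

# Estimate array weights; the noisy array should receive a low weight
aw <- arrayWeights(y, design)
print(round(aw, 2))

# Keep all arrays, weight them, and use robust empirical Bayes moderation
fit <- lmFit(y, design, weights = aw)
fit <- eBayes(fit, robust = TRUE)
topTable(fit, coef = 2, number = 5)
```

No array or probe is discarded by hand in this sketch. Whether the 
combination actually helps on a given data set is an empirical question 
that the simulation does not settle.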

Best wishes
Gordon

> Date: Wed, 11 Dec 2013 13:14:24 -0500
> From: Richard Friedman <friedman at c2b2.columbia.edu>
> To: "bioconductor at r-project.org list" <bioconductor at r-project.org>
> Subject: [BioC] simultaneous use of robust and weighting methods in
> 	limma.
>
> Dear List,
>
> Should arrayWeights(), eBayes(robust=TRUE), and lmFit(...,method="robust")
> be used simultaneously in limma? If not, should any combination be used
> together?
>
> Thanks and best wishes,
> Rich
> Richard A. Friedman, PhD
> Associate Research Scientist,
> Biomedical Informatics Shared Resource
> Herbert Irving Comprehensive Cancer Center (HICCC)
> Lecturer,
> Department of Biomedical Informatics (DBMI)
> Educational Coordinator,
> Center for Computational Biology and Bioinformatics (C2B2)/
> National Center for Multiscale Analysis of Genomic Networks (MAGNet)/
> Columbia Department of Systems Biology
> Room 824
> Irving Cancer Research Center
> Columbia University
> 1130 St. Nicholas Ave
> New York, NY 10032
> (212)851-4765 (voice)
> friedman at c2b2.columbia.edu
> http://friedman.c2b2.columbia.edu/
>
> In memoriam, Frederik Pohl
