[BioC] Apply non-Gaussian distribution data to limma
James W. MacDonald
jmacdon at uw.edu
Tue Dec 10 15:04:48 CET 2013
The distributions you are talking about are within-subject (e.g., the
distribution of the expression values for multiple genes from an
individual), but the comparisons you are making are between subjects.
It really doesn't matter what the within-subject distributions look
like, unless you are concerned with intra-subject comparisons. The
between-subject comparisons are usually based on far too few replicates
to get a real sense of the distribution, and you are usually making
thousands of such comparisons, so even if you could check the gene-wise
distributions you would then have to 'fix' them one by one.
Luckily, the t-statistic is pretty robust to non-normally distributed
data, so you (like thousands of people already) can just go ahead and
fit the model using limma. If you are really worried, you could use a
resistant regression, but the cost is power, which most microarray
studies (at least in my experience) are lacking already.
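
The robustness claim can be checked with a small simulation. The sketch
below is only an illustration under stated assumptions: it uses an
ordinary pooled-variance t-statistic (not limma's moderated t), draws
every value from a hypothetical 50/50 mixture of N(0, 1) and N(4, 1),
and uses 5 replicates per group with no true difference between groups.
All function names are made up for this example. If the t-statistic is
robust, roughly 5% of the simulated genes should exceed the 5% critical
value.

```python
import random

N_PER_GROUP = 5
# Two-sided 5% critical value of Student's t with 8 degrees of freedom
# (2 * N_PER_GROUP - 2); it must be changed if N_PER_GROUP changes.
T_CRIT = 2.306


def pooled_t(a, b):
    """Ordinary two-sample t-statistic with pooled variance."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_a - mean_b) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def bimodal(rng):
    """One draw from a 50/50 mixture of N(0, 1) and N(4, 1) (assumed shape)."""
    if rng.random() < 0.5:
        return rng.gauss(0.0, 1.0)
    return rng.gauss(4.0, 1.0)


def false_positive_rate(n_genes=20000, seed=1):
    """Fraction of null 'genes' whose |t| exceeds the 5% critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        group_a = [bimodal(rng) for _ in range(N_PER_GROUP)]
        group_b = [bimodal(rng) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(group_a, group_b)) > T_CRIT:
            hits += 1
    return hits / n_genes


rate = false_positive_rate()
print(f"empirical false-positive rate: {rate:.3f} (nominal 0.05)")
```

A rate close to the nominal 0.05 would be consistent with the point
above; a heavily skewed or heavy-tailed distribution could be swapped
into bimodal() to see how far the robustness extends.
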
On Monday, December 09, 2013 10:02:25 PM, Jiang [guest] wrote:
> I have high-throughput array data with two peaks in its distribution (both raw and log2-transformed). From searching online, it looks like a mixed Gaussian distribution, i.e. a mixture of two normal distributions, as some people suggested. I think limma assumes a normal distribution. I was wondering if there is any way to fix the problem, or to convert my data to a normal distribution before applying limma.
James W. MacDonald, M.S.
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099