[BioC] Regression analysis problem

Fri Oct 7 19:57:34 CEST 2011

 Dear Kevin,

 How you treat the DNA methylation data will very much depend on the specific biological question you are trying to address. It is pretty clear
 that if you have a mixture of different cell types, each with a different methylation pattern, then small yet biologically
 and clinically relevant changes in the cell type composition will lead to small yet statistically significant changes in methylation.

The evidence for biologically relevant small changes in methylation (differences in means of ~0.1) is overwhelming, for instance in DNA methylation studies conducted on whole blood DNA where blood cell subtype composition changes in response to the presence of cancer.
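
As a rough illustration of the mixture argument above, the sketch below treats the bulk beta value at one CpG as the fraction-weighted mean of the per-cell-type beta values. The two cell types, their beta values (0.9 and 0.1) and the composition shift (60/40 to 50/50) are hypothetical numbers chosen for the arithmetic, not values from any study.

```python
def mixture_beta(fractions, cell_type_betas):
    """Bulk beta value as the fraction-weighted mean of per-cell-type betas."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("cell-type fractions must sum to 1")
    return sum(f * b for f, b in zip(fractions, cell_type_betas))


# Hypothetical CpG: largely methylated in cell type A, unmethylated in cell type B.
cell_type_betas = [0.9, 0.1]

# Hypothetical compositions: a 10-point shift from type A to type B.
before = mixture_beta([0.60, 0.40], cell_type_betas)
after = mixture_beta([0.50, 0.50], cell_type_betas)
delta = before - after

print(round(before, 3), round(after, 3), round(delta, 3))  # 0.58 0.5 0.08
```

Under these assumptions, a modest shift in composition moves the bulk beta value by about 0.08, even though no individual cell type changed its methylation state.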

 kind regards
 A.

***********************************************************************************************************************************************
Andrew E Teschendorff   PhD
Heller Research Fellow
Statistical Cancer Genomics
Paul O'Gorman Building
UCL Cancer Institute
University College London
72 Huntley Street
London WC1E 6BT, UK.

Mob: +44 07876 561263
Email: a.teschendorff at ucl.ac.uk
http://www.ucl.ac.uk/cancer/research-groups/statistical_cancer_genomics/index.htm
********************************************************************************************************************************************
________________________________________
From: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] On Behalf Of Kevin R. Coombes [kevin.r.coombes at gmail.com]
Sent: 07 October 2011 18:26
To: ttriche at usc.edu
Cc: James W. MacDonald; Tim Triche, Jr.; bioconductor at r-project.org; Sean Davis; khadeeja ismail
Subject: Re: [BioC] Regression analysis problem

To follow up on Tim's point:

I have yet to see any evidence that methylation data is anything other
than ternary. The typical interpretation for the beta values is
     less than 0.25 = fully unmethylated
     greater than 0.75 = fully methylated
     between 0.25 and 0.75 = partially methylated
Part of the evidence for this assertion is that we have several sets of
data from samples treated with drugs that should fully methylate (or
fully demethylate, respectively) everything. On those samples, 99% of
the observed values are above 0.75 (or below 0.25, respectively).
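
The ternary interpretation above can be written as a small classifier. This is only a sketch: the function name and state labels are made up for illustration, and treating values exactly equal to 0.25 or 0.75 as "partial" is an assumption, since the thresholds above do not say which side the boundaries fall on.

```python
def methylation_state(beta, low=0.25, high=0.75):
    """Map a beta value in [0, 1] to one of three methylation states."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta values must lie in [0, 1]")
    if beta < low:
        return "unmethylated"
    if beta > high:
        return "methylated"
    # Assumption: the boundary values themselves count as partially methylated.
    return "partial"


betas = [0.03, 0.25, 0.48, 0.75, 0.92]  # hypothetical beta values
states = [methylation_state(b) for b in betas]
print(states)
# ['unmethylated', 'partial', 'partial', 'partial', 'methylated']
```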

So I'm not convinced that t-tests have a role to play in analyzing
genome-wide methylation data....

     Kevin

On 10/7/2011 10:39 AM, Tim Triche, Jr. wrote:
> On Fri, Oct 7, 2011 at 8:15 AM, James W. MacDonald <jmacdon at med.umich.edu> wrote:
>
>> First, by increasing the number of genes, you can more accurately
>> estimate an overall variance, which is then used in the eBayes() step to
>> 'shrink' your observed variance towards this overall variance. This is
>> one of the reasons that limma is so popular - by using information from
>> all genes, you can increase the power to detect differences in
>> individual genes.
>>
> Careful though -- this is methylation data, which tends to be strongly
> bimodal.  It's not clear that the assumption of a common variance across
> unmethylated, partially-methylated, and methylated sites is appropriate.  I
> seem to recall Gordon Smyth commenting upon this at one point -- perhaps
> he'll chime in.
>
>
>> Second, when you increase the number of pairs you are using, your power
>> to detect differences increases as well. This has nothing to do with
>> limma per se; it is just that the power of a t-test increases as you
>> increase the number of observations.
>>
> This is of course appropriate at any time that you can get more high-quality
> samples rather than fewer :-)
>
