[BioC] Analysing DNA methylation microarrays in Bioconductor

Paul Geeleher paulgeeleher at gmail.com
Fri Jul 23 21:45:10 CEST 2010


I see exactly what you mean.

Because these are CpG island arrays, adjacent reporters should
hopefully show methylation/unmethylation in the same direction, which
should help to pin down the important genomic regions. Thanks for the
food for thought!

Paul.
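
A rough sketch of the approach Sean suggests in the quoted messages
below: test each reporter between the two groups, then additionally
require that the group difference passes an arbitrary threshold. It is
written in plain Python rather than with any Bioconductor package. The
function names, the toy values, and the alpha and min_diff cutoffs are
all assumptions made for illustration; nothing here comes from the
actual arrays. The test is an exact permutation test on the difference
in group means, chosen only because Sean notes the test might need to
be nonparametric.

```python
from itertools import combinations
from statistics import mean


def permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    splits = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(n_a + n_b), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so floating-point ties count as "at least as extreme".
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


def differential_reporters(values, labels, alpha=0.05, min_diff=1.0):
    """Keep reporters that pass both the p-value cutoff and the size filter.

    values: dict mapping reporter id -> one value per sample (hypothetical layout)
    labels: "disease" or "control" for each sample, in the same order
    """
    selected = []
    for reporter, row in values.items():
        disease = [v for v, lab in zip(row, labels) if lab == "disease"]
        control = [v for v, lab in zip(row, labels) if lab == "control"]
        diff = mean(disease) - mean(control)
        p = permutation_pvalue(disease, control)
        if p <= alpha and abs(diff) >= min_diff:
            selected.append((reporter, diff, p))
    return selected


# Toy data only: 5 disease and 5 control samples, three made-up reporters.
labels = ["disease"] * 5 + ["control"] * 5
toy = {
    "r1": [2.1, 2.3, 1.9, 2.2, 2.0, 0.1, -0.1, 0.2, 0.0, 0.1],       # large shift
    "r2": [0.30, 0.32, 0.31, 0.33, 0.29, 0.10, 0.12, 0.11, 0.13, 0.09],  # tiny shift
    "r3": [0.5, -0.4, 0.1, 0.9, -0.6, 0.2, -0.3, 0.6, -0.1, 0.4],    # noise
}

for reporter, diff, p in differential_reporters(toy, labels):
    print(reporter, round(diff, 3), round(p, 4))
```

In this toy data r2 separates the groups perfectly, so its p-value is
as small as r1's, but its difference is tiny and the min_diff filter
drops it. Note also that with 5 versus 5 samples there are only 252
possible relabellings, so the smallest attainable two-sided p-value is
2/252; the sketch applies no multiple-testing correction, which would
matter a great deal with 244k reporters.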


On Fri, Jul 23, 2010 at 8:24 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> Hi, Paul.
> Thinking of methylation as a "black or white" affair might make sense for an
> individual cell or, perhaps, a perfectly homogeneous pool of cells (which
> probably does not exist), but from a tissue, I'm not sure that it is
> possible to think of methylation measurements that way.  What you are
> measuring is the aggregation of methylation profiles associated with
> potentially different methylation states in the tissue pool; this could
> certainly result in a fully continuous measure of methylation.  Therefore,
> finding statistical differences is still probably a useful way to think of
> the problem (though not the only one, obviously).  Just like for gene
> expression, a statistically significant result does not imply a
> biologically important result, so you may want to stipulate a further filter
> that the difference between your two groups pass some arbitrary threshold.
> Sean
>
> On Fri, Jul 23, 2010 at 1:16 PM, Paul Geeleher <paulgeeleher at gmail.com>
> wrote:
>>
>> Interesting. I'm not sure it'd make sense to use expression values
>> (log ratios I assume) because while there might be a statistically
>> significant difference between the expression levels in each of the
>> phenotypes, that doesn't necessarily imply that the reporters are
>> methylated in one phenotype and unmethylated in the other if you see
>> what I mean?
>>
>> I'm assuming in the second case you are referring to a p-value for
>> the probability of methylation of each reporter. Maybe this makes more
>> sense, but I think you still need one phenotype to have high
>> probability of methylation and the other phenotype to have high
>> probability of unmethylation, along with a statistically significant
>> difference in the p-values between the phenotypes?
>>
>> Paul.
>>
>> On Fri, Jul 23, 2010 at 8:02 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>> >
>> >
>> > On Fri, Jul 23, 2010 at 12:51 PM, Paul Geeleher <paulgeeleher at gmail.com>
>> > wrote:
>> >>
>> >> Thanks for the replies guys,
>> >>
>> >> Sean, we have 5 disease samples and 5 control samples. Each array has
>> >> 244k reporters located in CpG islands, averaging about 8 reporters per
>> >> CpG island.
>> >>
>> >
>> > So, why not generate a 10 x 244k matrix (or a 10 x 30k matrix if you
>> > summarize over CpG islands) and then apply a hypothesis test of your
>> > choice (which might even need to be nonparametric) to the data?  The
>> > value associated with each probe per sample could be either a raw
>> > value (after "appropriate normalization") or it could be derived from
>> > a number of ChIP-chip-like analysis packages (ACME, tilingarray, etc.).
>> > Sean
>> >
>> >>
>> >> Jinyan, doesn't MEDME require some kind of calibration experiment?
>> >> Needless to say this hasn't been done and it's unlikely that there is
>> >> money there to do it.
>> >>
>> >> Paul.
>> >>
>> >> On Fri, Jul 23, 2010 at 7:02 PM, Sean Davis <sdavis2 at mail.nih.gov>
>> >> wrote:
>> >> > Hi, Paul.  How many samples do you have?  And what are the sizes of
>> >> > the groups?
>> >> >
>> >> > It seems to me that you have a number for each probe.  You could do
>> >> > probewise testing between groups, or you could do some summarization
>> >> > first and then hypothesis testing.  In any case, there are a number
>> >> > of ways to arrive at an n x p matrix where standard statistical
>> >> > tools could be used.
>> >> >
>> >> > Sean
>> >> >
>> >> > On Jul 23, 2010 11:54 AM, "Paul Geeleher" <paulgeeleher at gmail.com>
>> >> > wrote:
>> >> >
>> >> > I understand your approach, but the main problem I'd see with such a
>> >> > thresholding approach is that you are highly likely to find regions
>> >> > that fall just below the cutoff for being called "methylated" in one
>> >> > phenotype and just above it in the other phenotype, and that are
>> >> > thus most likely not differentially methylated at all. Do you see
>> >> > what I mean?
>> >> >
>> >> > Perhaps some kind of approach that labels each reporter as having a
>> >> > probability of methylation (and hence a probability of
>> >> > unmethylation), which can be compared across samples of a given
>> >> > phenotype to give a probability of all reporters being
>> >> > methylated/unmethylated in each phenotype, and then compares these
>> >> > probabilities between phenotypes to give a probability of
>> >> > "differential methylation". That's just off the top of my head. I
>> >> > think it makes sense, but I'm presuming nothing like that has
>> >> > actually been implemented?
>> >> >
>> >> > Paul.
>> >> >
>> >> > On Fri, Jul 23, 2010 at 6:45 PM, Steve Lianoglou
>> >> > <mailinglist.honeypot at gmail.com> wrote:
>> >> >> Hi,
>> >> >>
>> >> >> ...
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Paul Geeleher
>> >> School of Mathematics, Statistics and Applied Mathematics
>> >> National University of Ireland
>> >> Galway
>> >> Ireland
>> >> --
>> >> www.bioinformaticstutorials.com
>> >>
>> >> _______________________________________________
>> >> Bioconductor mailing list
>> >> Bioconductor at stat.math.ethz.ch
>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>> >
>>
>>
>>
>
>



-- 
Paul Geeleher
School of Mathematics, Statistics and Applied Mathematics
National University of Ireland
Galway
Ireland
--
www.bioinformaticstutorials.com


