[BioC] Two channel data vs. one colour data for PCA, heatmaps and clustering

Mayer, Claus-Dieter c.mayer at abdn.ac.uk
Mon Nov 14 15:21:47 CET 2011


One issue I have found when using the single channels of a two-colour experiment in a multivariate visualisation technique (PCA plots, clustering, heatmaps, etc.) is that array or dye effects can mask the sources of variation you are most interested in. Strong array effects mean that the two channels from the same array cluster together; strong dye effects can result in the red and green channels forming two big groups. Normalisation can only eliminate these effects to a certain extent.
In a PCA (or similar ordination method) it therefore often makes sense to look at higher components as well as the first two, since the leading components may be taken up by array or dye effects.
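
To illustrate the point about higher components, here is a minimal sketch in plain Python (not limma or any Bioconductor code). Everything in it is an assumption made for illustration: the dye-swap design, the noise-free toy log2 intensities, the effect sizes, and the helper names (log_intensities, top_eigen). The dye effect is deliberately made larger than the treatment effect, so the first component follows the dye and the treatment only appears on the second component.

```python
import math

# Hypothetical dye-swap design: 4 arrays, 2 channels each.
samples = [
    # (array, dye, group)
    (1, "Cy5", "treated"), (1, "Cy3", "control"),
    (2, "Cy5", "control"), (2, "Cy3", "treated"),
    (3, "Cy5", "treated"), (3, "Cy3", "control"),
    (4, "Cy5", "control"), (4, "Cy3", "treated"),
]

def log_intensities(dye, group):
    """Toy noise-free log2 intensities for 6 genes: a strong dye effect
    on genes 0-2 and a weaker treatment effect on genes 3-5."""
    d = 1.5 if dye == "Cy5" else -1.5
    t = 0.5 if group == "treated" else -0.5
    return [8 + d] * 3 + [8 + t] * 3

X = [log_intensities(dye, group) for _, dye, group in samples]
n, p = len(X), len(X[0])

# Centre each gene (column) across samples.
means = [sum(row[j] for row in X) / n for j in range(p)]
Xc = [[row[j] - means[j] for j in range(p)] for row in X]

# Gene-by-gene covariance matrix.
C = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) / (n - 1)
      for k in range(p)] for j in range(p)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def top_eigen(M, iters=200):
    """Leading eigenpair of a symmetric PSD matrix by power iteration."""
    v = [float(j + 1) for j in range(len(M))]
    for _ in range(iters):
        w = matvec(M, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, matvec(M, v)))
    return lam, v

# First two principal components, removing each one (deflation)
# before extracting the next.
comps = []
D = [row[:] for row in C]
for _ in range(2):
    lam, v = top_eigen(D)
    comps.append((lam, v))
    D = [[D[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]

# Project each sample onto the components.
scores = [[sum(x * y for x, y in zip(row, v)) for _, v in comps]
          for row in Xc]

for (array, dye, group), (pc1, pc2) in zip(samples, scores):
    print(f"array {array} {dye} {group:8s} PC1={pc1:+.3f} PC2={pc2:+.3f}")
```

In this toy data the sign of PC1 tracks the dye and the sign of PC2 tracks the treatment group, so a plot of the first component alone would suggest that the samples only differ by channel. With real data the effects will not separate this cleanly, and the component that carries the treatment signal has to be found by inspection.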

Best Wishes


Dr. Claus-D. Mayer
Biomathematics & Statistics Scotland (BioSS)
Rowett Institute of Nutrition and Health
University of Aberdeen
Aberdeen AB21 9SB, Scotland, UK.
email: claus at bioss.ac.uk or c.mayer at abdn.ac.uk
Telephone: +44 (0) 1224 716652


> -----Original Message-----
> From: bioconductor-bounces at r-project.org [mailto:bioconductor-
> bounces at r-project.org] On Behalf Of mjonczyk
> Sent: 14 November 2011 10:52
> To: arraystruggles at gmail.com; bioconductor at r-project.org
> Subject: Re: [BioC] Two channel data vs. one colour data for PCA,
> heatmaps and clustering
> Dear John,
> I suppose that for the two-colour experiment you also have "A" (average
> expression) values.
> I don't know which package you have used, but limma has the RG.MA function,
> which transforms MA data to RG (i.e. unlogged intensities).
> So you could construct an MAList object from your data, transform it to
> an RGList
> (maybe take a log2), and you will have data for both channels.
> HTH,
> Maciej Jończyk
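
The back-transformation suggested above can be sketched by hand. This is a minimal Python illustration, not limma code, and the function name ma_to_rg is hypothetical. It assumes the usual definitions M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2, which give log2(R) = A + M/2 and log2(G) = A - M/2.

```python
import math

def ma_to_rg(m, a):
    """Back-transform one spot from (M, A) to unlogged (R, G) intensities.

    Assumes M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2.
    """
    log_r = a + m / 2.0  # log2 of the red channel
    log_g = a - m / 2.0  # log2 of the green channel
    return 2.0 ** log_r, 2.0 ** log_g

# One spot with M = 2 and A = 10.
r, g = ma_to_rg(2.0, 10.0)
print(r, g)  # 2048.0 512.0

# Taking log2 again gives per-channel values suitable as PCA or heatmap input.
print(math.log2(r), math.log2(g))  # 11.0 9.0
```

Applied to every spot, this yields one column of log2 intensities per channel instead of one column of M values per array.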
> > Dear Bioconductor.
> > In the past I have produced some PCA plots and heatmaps using one
> > colour data. On the PCA, it is useful to separate out the different
> > sample groups using the normalised expression values (say normal
> > coloured green and treatment coloured red).
> >
> > However, this sort of analysis does not seem possible with two-colour
> > data, as you have a single log2 normalised ratio (M value) as input to
> > PCA and heatmap functions.
> >
> > Does anyone have experience of doing PCA and/or heatmaps with 2 colour
> > data? Any info/advice appreciated.
> >
> > John.
> --
> Maciej Jonczyk,
> Department of Plant Molecular Ecophysiology
> Faculty of Biology, University of Warsaw
> 02-096 Warsaw, Miecznikowa 1
> Poland

