[BioC] reproducing dChip expression measure
ramasamy at cancer.org.uk
Wed Apr 13 15:42:37 CEST 2005
Dear Naomi, thank you for the response. Please see my replies below.
On Mon, 2005-04-11 at 19:14 -0400, Naomi Altman wrote:
> I think you will find that any 2 reasonable Affy normalization methods have
I am comparing the same expression measure (Li-Wong) as computed by two
different software packages (dChip and BioConductor).
> very high correlation. In the Irizarry et al paper on cross-lab and
> cross-platform comparisons this is called the "probe effect" and is due to
> the fact that the range of expression values is huge and the normalization
> methods do a reasonable job of preserving the ordering.
> However, this correlation does not translate into much overlap in the set
> of genes that are declared DE.
Very interesting paper indeed. Thank you for pointing it out. I will
need to read it more closely, though.
> A better measure of closeness of the 2 normalizations is the MA plot of the
> normalized values on the same array, using the 2 normalizations.
The MA plot is simply a 45-degree rotation of the scatterplot (up to a
rescaling of the axes), so I prefer to look at the scatterplots directly.
True, I should have done the scatterplots on an array-by-array basis, but
I am not too keen on looking at 48 (= 12 arrays x 4 models) plots.
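To make the rotation argument concrete, here is a minimal sketch in R on
simulated log2 values (made-up numbers, not the 12 arrays discussed here).
The names x, y and rot are only for illustration. It checks that rotating
the (x, y) scatterplot clockwise by 45 degrees gives the MA coordinates up
to constant factors of sqrt(2) on each axis.

```r
# Simulated log2 expression values for one array from two pipelines.
# These are made-up data, used only to illustrate the geometry.
set.seed(1)
x <- rnorm(1000, mean = 8, sd = 2)        # stand-in for log2 dChip values
y <- x + rnorm(1000, mean = 0, sd = 0.3)  # stand-in for log2 BioConductor values

# MA coordinates: M is the difference, A is the average.
M <- y - x
A <- (x + y) / 2

# Standard 2-d rotation matrix for a clockwise rotation by 45 degrees.
theta <- -pi / 4
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)

# Each row of cbind(x, y) is one point; rotate all points at once.
rot <- cbind(x, y) %*% t(R)

# The rotated coordinates equal A and M up to factors of sqrt(2).
print(all.equal(rot[, 1], sqrt(2) * A))
print(all.equal(rot[, 2], M / sqrt(2)))
```

So the two plots carry the same information; the MA plot mainly makes
departures from the line of identity easier to see because that line
becomes the horizontal axis M = 0.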
> Incidentally, I have never used the Li-Wong method, but I understand that
> it requires a fairly large data set (i.e. arrays/condition), so the
> differences between dChip and BioC may just be failure to converge.
Very good point. I did not even consider this. I wonder how stable the
expression measure is across different runs within R itself.
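One way I could check this without looking at dozens of plots is to
summarise the agreement between two runs array by array. The sketch below
uses a hypothetical helper, compare_runs(), and simulated matrices run1
and run2 as stand-ins; in practice I assume the two matrices would be the
expression values extracted (e.g. with exprs()) from the objects returned
by two separate expresso() calls, which is not run here.

```r
# Hypothetical helper: compare two probeset-by-array expression matrices
# column by column (one column per array).
compare_runs <- function(e1, e2) {
  stopifnot(identical(dim(e1), dim(e2)))
  arrays <- seq_len(ncol(e1))
  data.frame(
    array        = arrays,
    pearson      = sapply(arrays, function(j) cor(e1[, j], e2[, j])),
    max_abs_diff = sapply(arrays, function(j) max(abs(e1[, j] - e2[, j])))
  )
}

# Simulated stand-ins for two runs: 16000 probesets on 12 arrays,
# with the second run perturbed by a small amount of noise.
set.seed(42)
run1 <- matrix(rnorm(16000 * 12, mean = 8, sd = 2), ncol = 12)
run2 <- run1 + matrix(rnorm(16000 * 12, sd = 0.01), ncol = 12)

res <- compare_runs(run1, run2)
print(res)
```

If the fitting were fully deterministic, two runs on the same data should
give a correlation of 1 and a maximum absolute difference of 0 on every
array; anything else would point to convergence or starting-value effects.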
> At 11:01 AM 4/7/2005, Adaikalavan Ramasamy wrote:
> >I am trying to reproduce the dChip expression measure from the dChip
> >software with BioConductor packages. I am aware that dChip is not open
> >source but I would like to get as close as I can. Thus, I compare the
> >dChip expression measure from both software packages applied to a small
> >dataset of 12 arrays with approximately 16000 probesets.
> >Going through the mailing list archive, I found that I can use the
> >following combinations of parameter values to feed to expresso:
> >   model   pmcorrect.method   bgcorrect.method
> >   1       "pmonly"           "none"
> >   2       "subtractmm"       "none"
> >   3       "pmonly"           "mas"
> >   4       "subtractmm"       "mas"
> >with the following generic incantation to expresso:
> > expresso( ReadAffy(), normalize.method="invariantset",
> > bgcorrect.method=???, pmcorrect.method=???,
> > summary.method="liwong"
> > )
> >The correlations are high and similar (around 0.90). I have attached
> >both the scatterplot and the hexbin plot of the expression measures
> >from the two software packages under the different models, with the
> >line of identity in red. They suggest that:
> >a) The majority of the values are concentrated in the lower region
> >b) There appear to be highly correlated values at the higher end, but
> >they are not perfectly identical
> >c) the MM-subtracted data give more disagreement at the lower range but
> >are much closer to the line of identity at the higher range
> >d) mas5 background correction does not appear to make much difference
> >Can other members of the list comment on
> >a) whether they have seen similar findings
> >b) whether these results are expected and sensible
> >c) what else I can try to increase the reproducibility
> >Eventually I plan on applying BioConductor's version of dChip expression
> >measure to a few other datasets, so it would be useful to use the most
> >reproducible version from BioConductor.
> >Thank you very much.
> >Regards, Adai
> Naomi S. Altman 814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics 814-863-7114 (fax)
> Penn State University 814-865-1348 (Statistics)
> University Park, PA 16802-2111