[BioC] reproducing dChip expression measure
naomi at stat.psu.edu
Tue Apr 12 01:14:46 CEST 2005
I think you will find that any 2 reasonable Affy normalization methods have
very high correlation. In the Irizarry et al paper on cross-lab and
cross-platform comparisons this is called the "probe effect" and is due to
the fact that the range of expression values is huge and the normalization
methods do a reasonable job of preserving the ordering.
However, this correlation does not translate into much overlap in the set
of genes that are declared DE.
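As a rough illustration of the two points above, here is a small simulation sketch in base R. Everything in it is hypothetical: the number of genes, the baseline range, the noise level and the two "methods" are made up and are not dChip or any BioC method. It is an extreme case (no true DE, independent noise), so it only shows that highly correlated values on one array need not imply agreement on which genes look DE.

```r
set.seed(1)
n_genes <- 5000

# Hypothetical gene-level baseline spanning a wide range (log2 scale)
baseline <- runif(n_genes, min = 4, max = 14)

# Two arrays (A and B) with no true differential expression, each
# summarized by two hypothetical methods that add independent noise
noise_sd <- 0.3
m1_A <- baseline + rnorm(n_genes, sd = noise_sd)
m1_B <- baseline + rnorm(n_genes, sd = noise_sd)
m2_A <- baseline + rnorm(n_genes, sd = noise_sd)
m2_B <- baseline + rnorm(n_genes, sd = noise_sd)

# Correlation of expression values for the same array across methods:
# dominated by the wide spread of the baseline
cor_values <- cor(m1_A, m2_A)

# Correlation of log fold changes (B - A) across methods:
# the baseline cancels, leaving only the noise
cor_lfc <- cor(m1_B - m1_A, m2_B - m2_A)

# Overlap of the top 100 genes by absolute log fold change
top1 <- order(abs(m1_B - m1_A), decreasing = TRUE)[1:100]
top2 <- order(abs(m2_B - m2_A), decreasing = TRUE)[1:100]
overlap <- length(intersect(top1, top2))

print(c(cor_values = cor_values, cor_lfc = cor_lfc, overlap = overlap))
```

With real data the noise from 2 methods on the same array is not independent, so the disagreement would be less extreme than in this sketch.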
A better measure of closeness of the 2 normalizations is the MA plot of the
normalized values on the same array, using the 2 normalizations.
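A minimal sketch of that comparison in base R, assuming both sets of values are for the same array and are already on the log2 scale (values on the natural scale would need log2() first). The function name ma_values and the toy numbers are hypothetical, for illustration only.

```r
# x, y: log2 expression values for the same array from 2 methods,
# with probesets in the same order
ma_values <- function(x, y) {
  stopifnot(length(x) == length(y))
  data.frame(A = (x + y) / 2,  # average log intensity
             M = y - x)        # log ratio between the 2 methods
}

# Toy values (hypothetical), one entry per probeset
method1 <- c(6.1, 7.9, 10.2, 12.8)
method2 <- c(5.2, 7.5, 10.1, 12.9)
ma <- ma_values(method1, method2)

# Draw the plot only in an interactive session
if (interactive()) {
  plot(ma$A, ma$M, xlab = "A (mean log2 value)",
       ylab = "M (method2 - method1)")
  abline(h = 0, col = "red")  # M = 0 means the 2 methods agree
}
```

If the 2 methods agree, the points scatter tightly around M = 0 over the whole range of A; a trend or a fan shape at low A shows where they differ.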
Incidentally, I have never used the Li-Wong method, but I understand that
it requires a fairly large data set (i.e. many arrays per condition), so the
differences between dChip and BioC may just be due to a failure of the
model fit to converge.
At 11:01 AM 4/7/2005, Adaikalavan Ramasamy wrote:
>I am trying to reproduce the dChip expression measure from the dChip
>software with BioConductor packages. I am aware that dChip is not open
>source but I would like to get as close as I can. Thus, I compare the
>dChip expression measure from both programs applied to a small dataset
>of 12 arrays with approximately 16000 probesets.
>Going through the mailing list archive, I found that I can use the
>following combinations of parameter values to feed to expresso
>  model  pmcorrect.method  bgcorrect.method
>  1      "pmonly"          "none"
>  2      "subtractmm"      "none"
>  3      "pmonly"          "mas"
>  4      "subtractmm"      "mas"
>with the following generic incantation to expresso :
>  expresso( ReadAffy(), normalize.method="invariantset",
>            bgcorrect.method=???, pmcorrect.method=???,
>            summary.method="liwong" )
>The correlations of the values are high and similar ( around 0.90 ). I
>have attached both the scatterplot and hexbin plot of the expression
>measures from the two programs under the different models, with the line
>of identity in red. It suggests that :
>a) The majority of the values are concentrated in the lower region
>b) There appear to be highly correlated values at the higher end but they
>are not perfectly identical
>c) the MM-subtracted data give more disagreement at the lower range but
>are much closer to the line of identity at the higher range
>d) mas5 background correction does not appear to make much difference
>Can other members of the list comment on
>a) if they have seen similar findings
>b) if these results are expected and sensible
>c) what else I can try to increase the reproducibility
>Eventually I plan on applying BioConductor's version of dChip expression
>measure to a few other datasets, so it would be useful to use the most
>reproducible version from BioConductor.
>Thank you very much.
Naomi S. Altman 814-865-3791 (voice)
Bioinformatics Consulting Center
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111