[BioC] reproducing dChip expression measure

Adaikalavan Ramasamy ramasamy at cancer.org.uk
Wed Apr 13 15:42:37 CEST 2005


Dear Naomi, thank you for the response. Please see my replies inline below.


On Mon, 2005-04-11 at 19:14 -0400, Naomi Altman wrote:
> I think you will find that any 2 reasonable Affy normalization methods have 

I am comparing the same expression measure (Li-Wong) as computed by two
different software packages (dChip and BioConductor).

> very high correlation.  In the Irizarry et al paper on cross-lab and 
> cross-platform comparisons this is called the "probe effect" and is due to 
> the fact that the range of expression values is huge and the normalization 
> methods do a reasonable job of preserving the ordering.
> However, this correlation does not translate into much overlap in the set 
> of genes that are declared DE.

Very interesting paper indeed. Thank you for pointing it out. I will
need to read it more closely, though.
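
To convince myself of the point about correlation versus overlap, here is
a toy simulation in base R. Everything in it is an assumption made for
illustration: the data are simulated, there is no true differential
expression, and the two "measures" (meas_a, meas_b) differ only by
independent noise, which is an extreme case. It is a sketch of the idea
only, not of dChip or expresso.

```r
set.seed(42)
n_genes <- 2000
n_per_group <- 3

# Gene-specific baseline on the log2 scale: the wide range of these values
# is what drives the high correlation between the two measures.
baseline <- runif(n_genes, min = 4, max = 14)

# Hypothetical helper: baseline plus independent noise, genes x 6 arrays.
simulate_measure <- function(noise_sd) {
  matrix(baseline + rnorm(n_genes * 2 * n_per_group, sd = noise_sd),
         nrow = n_genes)
}

meas_a <- simulate_measure(0.3)  # stand-in for one software package
meas_b <- simulate_measure(0.3)  # stand-in for the other

# Correlation of the two measures on the first array.
r <- cor(meas_a[, 1], meas_b[, 1])

# Hypothetical helper: indices of the k genes with the largest absolute
# difference in group means (arrays 1-3 versus arrays 4-6).
top_de <- function(m, k = 100) {
  g1 <- seq_len(n_per_group)
  g2 <- n_per_group + seq_len(n_per_group)
  d <- rowMeans(m[, g1]) - rowMeans(m[, g2])
  order(abs(d), decreasing = TRUE)[seq_len(k)]
}

overlap <- length(intersect(top_de(meas_a), top_de(meas_b)))
cat("correlation on array 1:", round(r, 3), "\n")
cat("genes shared between the two top-100 lists:", overlap, "\n")
```

With real data the noise would not be independent and some genes would
truly be DE, so the overlap should be larger than in this toy; the point
is only that a high correlation by itself says little about it.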

> A better measure of closeness of the 2 normalizations is the MA plot of the 
> normalized values on the same array, using the 2 normalizations.

The MA plot is essentially a 45-degree rotation of the scatterplot, so I
prefer to look at the scatterplots directly. True, I should have done
the scatterplots on an array-by-array basis, but I am not too keen on
looking at 48 (= 12 arrays x 4 models) plots.
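
To make the rotation argument concrete, here is a minimal sketch in base
R. The values x and y are simulated stand-ins for the log2 expression
values of one array under the two software packages; they are assumptions
for illustration, not my actual data. Strictly speaking, M and A are the
rotated coordinates up to a constant rescaling of each axis.

```r
set.seed(1)
x <- runif(500, min = 4, max = 14)   # simulated values, software 1
y <- x + rnorm(500, sd = 0.3)        # simulated values, software 2

# MA coordinates of the same points.
M <- y - x
A <- (x + y) / 2

# Rotate the scatterplot coordinates clockwise by 45 degrees.
theta <- -pi / 4
rot <- matrix(c(cos(theta), -sin(theta),
                sin(theta),  cos(theta)), nrow = 2, byrow = TRUE)
xy_rot <- cbind(x, y) %*% t(rot)

# The rotated x-axis is A scaled by sqrt(2); the rotated y-axis is M
# scaled by 1/sqrt(2).
print(all.equal(xy_rot[, 1], sqrt(2) * A))
print(all.equal(xy_rot[, 2], M / sqrt(2)))
```

So the two plots carry the same information; the MA plot mainly makes
deviations from the line of identity easier to see.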

> Incidentally, I have never used the Li-Wong method, but I understand that 
> it requires a fairly large data set (i.e. arrays/condition), so the 
> differences between dChip and BioC may just be failure to converge.

Very good point. I had not even considered this. I wonder how stable the
expression measure is across different runs within R itself.

> --Naomi
> 
> At 11:01 AM 4/7/2005, Adaikalavan Ramasamy wrote:
> >I am trying to reproduce the dChip expression measure from the dChip
> >software with BioConductor packages. I am aware that dChip is not open
> >source but I would like to get as close as I can. Thus, I compare the
> >dChip expression measure from both software packages applied to a small
> >dataset of 12 arrays with approximately 16000 probesets.
> >
> >Going through the mailing list archive, I found that I can feed the
> >following combinations of parameter values to expresso:
> >
> >         model   pmcorrect.method   bgcorrect.method
> >         1       "pmonly"           "none"
> >         2       "subtractmm"       "none"
> >         3       "pmonly"           "mas"
> >         4       "subtractmm"       "mas"
> >
> >with the following generic incantation to expresso :
> >
> >   expresso( ReadAffy(), normalize.method="invariantset",
> >             bgcorrect.method=???, pmcorrect.method=???,
> >             summary.method="liwong"
> >           )
> >
> >
> >The correlations of the values are high and similar (around 0.90). I
> >have attached both the scatterplot and hexbin plot of expression measures
> >from these two software packages under the different models, with the
> >line of identity in red. It suggests that:
> >
> >a) The majority of the values are concentrated in the lower regions
> >b) There appear to be highly correlated values at the higher end but
> >they are not perfectly identical
> >c) The MM-subtracted data give more disagreement at the lower range but
> >are much closer to the line of identity at the higher range
> >d) mas5 background correction does not appear to make much difference
> >
> >
> >Can other members of the list comment on
> >a) whether they have seen similar findings
> >b) whether these results are expected and sensible
> >c) what else I can try to increase the reproducibility
> >
> >
> >Eventually I plan on applying BioConductor's version of the dChip
> >expression measure to a few other datasets, so it would be useful to use
> >the most reproducible version from BioConductor.
> >
> >Thank you very much.
> >
> >Regards, Adai
> >
> >_______________________________________________
> >Bioconductor mailing list
> >Bioconductor at stat.math.ethz.ch
> >https://stat.ethz.ch/mailman/listinfo/bioconductor
> 
> Naomi S. Altman                                814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics                              814-863-7114 (fax)
> Penn State University                         814-865-1348 (Statistics)
> University Park, PA 16802-2111
> 
>


