[BioC] normalization and analysis of connected designs

Thu Jul 3 15:06:14 MEST 2003

Dear Gordon,

Thank you very much for your comments and discussion of Wolfgang's message, 
and for clarifying some issues (about Wolfinger's and Churchill's approaches) 
which I thought I understood, but didn't. Thanks a lot for the reference, 
too.

What is interesting about Churchill's approach, though, is that his paper in 
Nature Genetics (2002, 32: 490-495) makes all comparisons either within-array 
or using connected designs. For example, one of his papers with K. Kerr (Kerr 
& Churchill, 2001, Biostatistics, 2: 183-201) says explicitly that "in order 
to fit models such as (4.1), (4.2) and (4.3) a design should be connected" 
(p. 8 of the technical report; 4.1 to 4.3 are the usual ANOVA models of the 
Churchill group).
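
To make "connected" concrete for myself, here is a minimal sketch in Python. 
It is only my own illustration, not anything taken from the Kerr & Churchill 
paper or from limma. It assumes each array is described by the pair of RNA 
samples hybridized on it, and it calls a design connected when every sample 
can be reached from every other through a chain of shared arrays. The 
function name is_connected and the sample labels A to D are made up for the 
example.

```python
from collections import defaultdict, deque


def is_connected(arrays):
    """Return True if all samples are linked through a chain of arrays.

    `arrays` is a list of (sample_1, sample_2) pairs, one pair per
    two-colour array. Samples are nodes and arrays are edges of a graph.
    """
    graph = defaultdict(set)
    for first, second in arrays:
        graph[first].add(second)
        graph[second].add(first)

    if not graph:
        return True  # an empty design has nothing to disconnect

    # Breadth-first search from an arbitrary sample.
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    # Connected only if the search reached every sample in the design.
    return len(seen) == len(graph)


# Hypothetical designs: a chain A-B, B-C, C-D, and two unlinked arrays.
chain = [("A", "B"), ("B", "C"), ("C", "D")]
split = [("A", "B"), ("C", "D")]

print(is_connected(chain))  # True: D and A are linked via B and C
print(is_connected(split))  # False: no path of log-ratios from A to D
```

In the chain design, D and A can be compared only by adding up the 
log-ratios along the path between them; in the split design no such path 
exists, so a log-ratio analysis cannot compare them at all.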

I'll have to do some more reading. This is getting a lot messier than I 
thought.

Best,

Ramón

On Thursday 03 July 2003 05:27, Gordon Smyth wrote:
> Dear Ramon,
>
> I am with you. The direct comparison design you describe is a very sensible
> type of design which is intended to compare RNA samples using within-spot
> comparisons, i.e., log-ratios or M-values. The limma package in
> Bioconductor is specifically designed to analyse experiments of this type.
> You're quite correct that you do need a connected design in order to compare
> all the RNA types in this way.
>
> Wolfgang is arguing for what in my lab we call a 'single-channel analysis'.
> The main proponents of single-channel analysis in the literature are Russ
> Wolfinger at SAS and Gary Churchill at the Jax lab. As far as I am aware
> there is no software in Bioconductor designed to do single-channel analysis
> of cDNA arrays. We (I mean here Jean and Sandrine, who wrote the marray
> packages, and I) don't yet provide single-channel software because we
> consider it to be an experimental methodology whose validity is still to be
> established. Normalization of single-channel data in particular is
> something that we are still trying to do a satisfactory job of. The only
> discussion of single-channel normalization for cDNA data that I am aware of
> in the literature is Yang and Thorne, see below.
>
> Wolfinger and Churchill fit mixed linear models in which a spot is a
> random effect. One then has multiple error strata corresponding to spots,
> to individual channel intensities within spots and perhaps to arrays as
> well. There are certainly cases where one can get more information out of
> this approach than analysing arrays entirely using log-ratios.
> Statistically, the method consists of using random effects to recover
> information from the between-spot error strata. The real problem is to know
> when it is valid to take this approach and when it is not.
>
> I may have misinterpreted Wolfgang, but he does seem to be proposing an even
> more radical approach in which the spot error stratum is ignored entirely.
> (I think that is the only way one could get the calculation that the D-A
> variance is reduced by a third.) This is more radical than anything I've
> seen in the literature, and I don't personally think it would be a good
> approach for cDNA microarray data.
>
> Regards
> Gordon
>
> Yang, Y. H., and Thorne, N. P. (2003). Normalization for two-color cDNA
> microarray data. In: D. R. Goldstein (ed.), Science and Statistics: A
> Festschrift for Terry Speed, IMS Lecture Notes - Monograph Series, Volume
> 40, pp. 403-418.
>
> At 03:27 AM 3/07/2003, w.huber at dkfz-heidelberg.de wrote:
> >Hi Ramon,
> >
> >What makes the difference between D and A hybridized on the same array,
> >and on different arrays? It is (a) the between-array variation (e.g.
> >because each time the spotter puts down a drop of DNA it is a little bit
> >different, or because the arrays had different surface treatments, etc.),
> >and (b) the between-hybridization variation (e.g. different temperatures,
> >different volumes of the reaction chamber). These two sources of variation
> >need to be compared to other sources, e.g. (c) between-RNA-extraction,
> >(d) between-reverse-transcription, (e) between-labeling, (f) between-dyes.
> >(c)-(f) are present no matter whether your D and A are on one array or on
> >different ones.
> >
> >That it is possible to make (a) and (b) small is shown by the fact that
> >useful results have been obtained through single-color arrays such as Affy
> >or Nylon membranes. Whether in your experiment (a) and (b) are small
> >compared to (c)-(f) depends on your particular experiment. If they are,
> >you are better off with h_3G - h_1R than with the full chain of summands. I
> >have seen examples where this seemed to be the case.
> >
> >Anyone else?
> >
> >Best regards
> >   Wolfgang
> >
> >
> >On Wed, 2 Jul 2003, Ramon Diaz-Uriarte wrote:
> >... [SNIP]
> >
> > > I am not sure I follow this. I understand that, __if__ D and A had been
> > > hybridized in the same array, then the variance of their comparison
> > > would be a third of the variance of the comparison having to use the
> > > (two-step) connection between A and D. But I am not sure I see how we
> > > can directly do h_3G - h_1R
> > > (if this were possible, then there would be no need to use connected
> > > designs.)
> > >
> > > ... [SNIP] ...
> > >
> > > So either way, I don't get to see how we can directly do
> > > h_3G - h_1R
> > >
> > > But then, maybe I am missing something obvious again...
> >
> >_______________________________________________
> >Bioconductor mailing list
> >Bioconductor at stat.math.ethz.ch
> >https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://bioinfo.cnio.es/~rdiaz