[BioC] normalization and analysis of connected designs

Ramon Diaz-Uriarte rdiaz at cnio.es
Fri Jul 25 12:22:54 MEST 2003


Dear Xavi,

Thanks for your last two emails. I see your point, but it is my understanding
(which has improved a lot thanks to comments from Gordon Smyth) that one of
the important reasons for dealing with ratios in cDNA arrays is controlling
spot-to-spot variation. In fact, this is mentioned explicitly in several
papers (e.g., Yang & Thorne, 2003, p. 405). So, regardless of the importance
of competitive hybridization, spot-to-spot variation is always there.

I am not that familiar with Affy, but I think that, because their setup is
very different (e.g., multiple probes per clone), the direct analogy "if we
do it with Affy we ought to be able to do it with cDNA" does not really hold
just like that.
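
To make the spot-to-spot argument concrete, here is a small simulation sketch
in Python (standard library only). It assumes a purely additive model on the
log scale (true level + a spot effect shared by both channels of a spot +
independent noise), which is a simplification of mine and not something taken
from the papers cited here. The function name `simulate` and all numeric
values are made up for illustration. It compares, for the A -> B -> C -> D
design discussed in this thread, the sum of the three within-array log ratios
with the direct single-channel difference between arrays 3 and 1.

```python
import random
import statistics


def simulate(sd_spot, sd_noise, n_rep=20000, seed=1):
    """Return (var_chain, var_direct) of two estimators of D - A for one gene.

    Hypothetical additive model on the log scale:
        h = true level of the sample + spot effect of the array + noise
    Design A -> B -> C -> D: array k carries one sample in green (Cy3)
    and the next sample in red (Cy5).
    """
    rng = random.Random(seed)
    level = {"A": 0.0, "B": 0.5, "C": 1.0, "D": 1.5}  # arbitrary true values
    arrays = [("A", "B"), ("B", "C"), ("C", "D")]
    chain, direct = [], []
    for _ in range(n_rep):
        green, red = [], []
        for g_sample, r_sample in arrays:
            spot = rng.gauss(0.0, sd_spot)  # shared by both channels of a spot
            green.append(level[g_sample] + spot + rng.gauss(0.0, sd_noise))
            red.append(level[r_sample] + spot + rng.gauss(0.0, sd_noise))
        # Sum of within-array log ratios: the spot effects cancel.
        chain.append(sum(r - g for r, g in zip(red, green)))
        # Single-channel difference across arrays: spot effects remain.
        direct.append(red[2] - green[0])
    return statistics.variance(chain), statistics.variance(direct)


no_spot = simulate(sd_spot=0.0, sd_noise=0.2)
big_spot = simulate(sd_spot=0.6, sd_noise=0.2)
print("no spot effect:    chain %.3f  direct %.3f" % no_spot)
print("large spot effect: chain %.3f  direct %.3f" % big_spot)
```

Under this model the chain estimator has variance 6 * sd_noise^2 and the
direct one has variance 2 * sd_spot^2 + 2 * sd_noise^2. Without spot effects
the direct comparison therefore has one third of the variance, but as soon as
sd_spot^2 exceeds 2 * sd_noise^2 the sum of log ratios does better.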

By the way, the paper of Yang & Thorne, which Gordon Smyth mentioned in a
previous email, contains discussion of single-channel normalization for cDNA,
and Natalie Thorne presented a very interesting talk at the last RSS meetings
dealing with single-channel normalization. However, if I understand
correctly, there are still some issues that need to be investigated more
fully for single-channel normalization, and they are working on it.
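
For reference, the quantile normalization suggested earlier in this thread
amounts to forcing every channel to share the same empirical distribution.
The following is only a minimal Python sketch of that idea, not the LIMMA
implementation: the function name `quantile_normalize` is hypothetical, ties
are broken by position, and missing values are not handled.

```python
def quantile_normalize(channels):
    """Quantile-normalize equal-length lists of log intensities.

    Each channel is a list over the same spots. Ties are broken by
    position and missing values are not handled (simplifications).
    """
    n = len(channels[0])
    if any(len(ch) != n for ch in channels):
        raise ValueError("all channels must have the same number of spots")
    sorted_channels = [sorted(ch) for ch in channels]
    # Reference distribution: mean of the k-th smallest value across channels.
    reference = [sum(col) / len(col) for col in zip(*sorted_channels)]
    normalized = []
    for ch in channels:
        order = sorted(range(n), key=lambda i: ch[i])  # spot indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # replace value by the reference quantile
        normalized.append(out)
    return normalized


# Two toy channels measured on the same three spots.
raw = [[2.0, 5.0, 3.0], [4.0, 9.0, 7.0]]
norm = quantile_normalize(raw)
print(norm)
```

After this step every channel contains exactly the same set of values, so
only the rank of each spot within its channel is retained.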

Yang, Y. H., and Thorne, N. P. (2003). Normalization for two-color cDNA 
microarray data. In: D. R. Goldstein (ed.), Science and Statistics: A 
Festschrift for Terry Speed, IMS Lecture Notes - Monograph Series, Volume 
40, pp. 403-418.


Best,

Ramón




On Thursday 03 July 2003 14:48, Xavier Solé wrote:
> We have seen that the effect of the competitive hybridization is not so
> relevant for cDNA microarrays. In fact, Affy arrays hybridize just one
> sample per chip.
>
> To perform quantile normalization, look at the LIMMA package, Ramon.
>
> Cheers,
>
> Xavi.
>
> ----- Original Message -----
> From: "Ramon Diaz-Uriarte" <rdiaz at cnio.es>
> To: "Xavier Solé" <x.sole at ico.scs.es>; <w.huber at dkfz-heidelberg.de>;
> "bioconductor" <bioconductor at stat.math.ethz.ch>
> Sent: Thursday, July 03, 2003 1:31 PM
> Subject: Re: [BioC] normalization and analysis of connected designs
>
> > Dear Xavi,
> >
> > Thanks for the comment; that option (as well as Wolfgang's comments)
> > seems to me a puzzling possibility... It would be really nice, but I am
> > not sure I see how one would be able to do it (see also Gordon Smyth's
> > comments in this thread).
> >
> > By the way, is there any package for quantile normalization for cDNA
> > arrays?
> >
> > Best,
> >
> > Ramón
> >
> > On Wednesday 02 July 2003 18:04, Xavier Solé wrote:
> > > If you use a quantile normalization and have each channel replicated at
> > > least twice you may be able to do comparisons of the intensities of
> > > different channels, even though they are not connected.
> > >
> > > Regards,
> > >
> > > Xavi.
> > >
> > > ----- Original Message -----
> > > From: "Ramon Diaz-Uriarte" <rdiaz at cnio.es>
> > > To: <w.huber at dkfz-heidelberg.de>; "bioconductor"
> > > <bioconductor at stat.math.ethz.ch>
> > > Sent: Wednesday, July 02, 2003 5:52 PM
> > > Subject: Re: [BioC] normalization and analysis of connected designs
> > >
> > > > Dear Wolfgang,
> > > >
> > > > Thank you very much for your answer. A couple of things I don't see:
> > > > > Another point: It may not always be true that
> > > > >
> > > > > [1] h_3G - h_3R + h_2G - h_2R + h_1G - h_1R
> > > > >
> > > > > is a better estimate for the D-A comparison than
> > > > >
> > > > > [2] h_3G - h_1R
> > > > >
> > > > > Here, h_3G is the green channel on array 3, h_1R the red on array
> > > > > 1, and so on. For good arrays, [2] should have a three times lower
> > > > > variance. However, [1] may be able to correct for spotting
> > > > > > irregularities between the chips. Thus which is better depends on
> > > > > > the data and the quality of the chips. You may want to try both.
> > > >
> > > > > I am not sure I follow this. I understand that, __if__ D and A had
> > > > > been hybridized in the same array, then the variance of their
> > > > > comparison would be a third of the variance of the comparison having
> > > > > to use the (two-step) connection between A and D. But I am not sure
> > > > > I see how we can directly do
> > > > > h_3G - h_1R
> > > > > (if this were possible, then, there would be no need to use connected
> > > > > designs.)
> > > >
> > > > > The way I was seeing the above set up was:
> > > > > from h_3 we can estimate phi_3 = D - C (as the mean log ratio from
> > > > > the arrays of type 3),
> > > > > from h_2, phi_2 = C - B
> > > > > from h_1, phi_1 = B - A
> > > > > phi_1, phi_2, and phi_3 are the three basic estimable effects.
> > > > >
> > > > > Since I want D - A, I estimate that from the linear combination of
> > > > > the phis (which here is just the sum of the phis).
> > > >
> > > > > This is doing it "by hand"; I think that if we use a set up such as
> > > > > the ANOVA approach of Kerr, Churchill and collaborators (or Wolfinger
> > > > > et al), we end up doing essentially the same (we eventually get the
> > > > > "VG" effects), and we still need a connected design.
> > > >
> > > > So either way, I don't get to see how we can directly do
> > > > h_3G - h_1R
> > > >
> > > > But then, maybe I am missing something obvious again...
> > > >
> > > >
> > > > Best,
> > > >
> > > > Ramón
> > > >
> > > > > Best regards
> > > > >
> > > > >   Wolfgang
> > > > >
> > > > > On Tue, 1 Jul 2003, Ramon Diaz wrote:
> > > > > > > Suppose we have an experiment with cDNA microarrays with the
> > > > > > > structure:
> > > > > > > A -> B -> C -> D
> > > > > > > (i.e., A and B hybridized in the same array, A with Cy3, B with
> > > > > > > Cy5; B and C in the same array, with B with Cy3, etc).
> > > > > > >
> > > > > > > In this design, and if we use log_2(R/G), testing A == D is
> > > > > > > straightforward since A and D are connected and we can express
> > > > > > > D - A as the sum of the log ratios in the three arrays.
> > > > > > >
> > > > > > > But suppose we use some non-linear normalization of the data,
> > > > > > > such as loess as in Yang et al. 2002 (package marrayNorm) or the
> > > > > > > variance stabilization method of Huber et al., 2002 (package
> > > > > > > vsn).  Now, the values we have after the normalization are no
> > > > > > > longer log_2(R/G) but something else (that changes with, e.g.,
> > > > > > > log_2(R*G)).  Doesn't this preclude the simple "just add the
> > > > > > > ratios"? Is there something obvious I am missing?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Ramón
> > > > >
> > > > > _______________________________________________
> > > > > Bioconductor mailing list
> > > > > Bioconductor at stat.math.ethz.ch
> > > > > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
> > > >
> > > > --
> > > > Ramón Díaz-Uriarte
> > > > Bioinformatics Unit
> > > > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > > > (Spanish National Cancer Center)
> > > > Melchor Fernández Almagro, 3
> > > > 28029 Madrid (Spain)
> > > > Fax: +-34-91-224-6972
> > > > Phone: +-34-91-224-6900
> > > >
> > > > http://bioinfo.cnio.es/~rdiaz
>

-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://bioinfo.cnio.es/~rdiaz


