[BioC] Re: technical replicates (again!): a summary

Thu Apr 1 20:14:41 CEST 2004

If you treat technical replicates as fixed effects in a randomized complete
block (RCB) design with within-block replicates, you get the wrong error
term for testing the fixed effects.  So, I think this only works if you then
designate rep*treatment as the error.
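
One way to read the remark above is as a balanced two-way layout in which the treatment mean square can be tested against either the within-cell (technical) error or the block-by-treatment interaction. The sketch below is only an illustration under that reading: the data are made up, "block" is assumed to stand for the biological replicate, and `anova_rcb` is a hypothetical helper, not a function from any package mentioned in this thread.

```python
# Hypothetical data: 3 blocks (biological samples) x 2 treatments,
# each cell holding 2 technical replicates. All values are invented.
data = {
    ("b1", "control"): [5.0, 5.2], ("b1", "treated"): [6.0, 6.2],
    ("b2", "control"): [4.0, 4.2], ("b2", "treated"): [6.5, 6.7],
    ("b3", "control"): [6.0, 6.2], ("b3", "treated"): [6.4, 6.6],
}


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


def anova_rcb(data):
    """Balanced two-way ANOVA sums of squares; returns both candidate F ratios."""
    blocks = sorted({b for b, _ in data})
    trts = sorted({t for _, t in data})
    r = len(next(iter(data.values())))  # tech. reps per cell (balanced design assumed)
    nb, nt = len(blocks), len(trts)

    grand = mean(y for ys in data.values() for y in ys)
    cell = {k: mean(v) for k, v in data.items()}
    trt_mean = {t: mean(cell[(b, t)] for b in blocks) for t in trts}
    blk_mean = {b: mean(cell[(b, t)] for t in trts) for b in blocks}

    ss_trt = nb * r * sum((m - grand) ** 2 for m in trt_mean.values())
    ss_blk = nt * r * sum((m - grand) ** 2 for m in blk_mean.values())
    ss_cell = r * sum((m - grand) ** 2 for m in cell.values())
    ss_int = ss_cell - ss_trt - ss_blk  # block-by-treatment interaction
    ss_within = sum((y - cell[k]) ** 2 for k, ys in data.items() for y in ys)

    df_trt = nt - 1
    df_int = (nb - 1) * (nt - 1)
    df_within = nb * nt * (r - 1)
    ms_trt = ss_trt / df_trt
    return {
        "F_vs_within": ms_trt / (ss_within / df_within),
        "df_within": df_within,
        "F_vs_interaction": ms_trt / (ss_int / df_int),
        "df_interaction": df_int,
    }


result = anova_rcb(data)
print(result)
```

With tightly agreeing technical replicates, the within-cell mean square is small and has more degrees of freedom, so the F ratio against it is much larger than the one against the interaction term.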

--Naomi

At 11:27 AM 4/1/2004 +0200, Ramon Diaz-Uriarte wrote:
>Sorry, Gordon, you are right. My fault.
>In fact, wouldn't that be a good way to go, and avoid convergence problems
>with REML, especially if we don't care much about the random effect and
>within-subject replication is small (i.e., the number of tech. reps. is
>small)?
>
>R.
>
>
>On Thursday 01 April 2004 00:18, Gordon Smyth wrote:
> > Hi Ramon,
> >
> > You've left out an important strategy, which I've suggested a couple of
> > times recently, which is to fit the technical replicates as a fixed factor
> > rather than a random factor.
> >
> > Cheers
> > Gordon
> >
> > At 01:49 AM 1/04/2004, you wrote:
> > >Dear Gordon, Naomi, and BioC list,
> > >
> > >The issue of how to deal with technical replicates (such as those obtained
> > >when we do dye-swaps of the same biological samples in cDNA arrays) has
> > > come up in the BioC list several times. What follows is a short summary,
> > > with links to mails in BioC plus some questions/comments.
> > >
> > >
> > >There seem to be three major ways of approaching the issue:
> > >
> > >
> > >1. Treat the technical replicates as ordinary replicates
> > >*************************************************************
> > >E.g., Gordon Smyth in Sept. 2003
> > >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002405.html)
> > >
> > >However, this makes me (and Naomi Altman; e.g.,
> > >https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-December/003340.html)
> > >uneasy: tech. reps. are not independent biological reps., so treating
> > >them as such leads to the usual inflation of dfs and deflation of s.e.
> > >
> > >I guess part of the key to Gordon's suggestion is his comment that even if
> > >the s.e. are slightly underestimated, the ordering is close to being the
> > >optimal one. But I don't see why the ordering ought to be much worse if we
> > >use the means of technical replicates as in 3. below. (I haven't done the
> > >math, but it seems that, especially in the presence of strong tech. reps.
> > >covariance and a small number of independent samples, we ought to be better
> > >off using the means of the tech. reps.)
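
The "inflation of dfs and deflation of s.e." concern can be illustrated with a small sketch. The numbers are invented for illustration (one gene, three biological samples, two strongly correlated technical replicates each), and `se_of_mean` is a hypothetical helper rather than anything from the thread.

```python
import math
import statistics

# Hypothetical log-ratios for one gene: 3 biological samples,
# 2 technical replicates each (all values are made up).
tech_reps = {"s1": [0.9, 1.1], "s2": [1.9, 2.1], "s3": [2.9, 3.1]}


def se_of_mean(values):
    """Standard error of the mean, treating `values` as independent."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Approach 1: treat every array as an ordinary, independent replicate.
pooled = [y for ys in tech_reps.values() for y in ys]
naive_se, naive_df = se_of_mean(pooled), len(pooled) - 1

# Approach 3: average the technical replicates first, one value per sample.
averaged = [statistics.mean(ys) for ys in tech_reps.values()]
avg_se, avg_df = se_of_mean(averaged), len(averaged) - 1

print(f"all arrays as independent: se={naive_se:.3f}, df={naive_df}")
print(f"means of tech. reps.:      se={avg_se:.3f}, df={avg_df}")
```

Both approaches give the same point estimate here; only the standard error and the degrees of freedom differ.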
> > >
> > >
> > >2. Mixed effects models with subject as random effect (e.g., via lme).
> > >******************************************************************************
> > >
> > >In late August of 2003 I asked about these issues, and Gordon seemed to
> > >agree that trying the lme approach could be a way to go
> > >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-August/002224.html).
> > >
> > >However, in September, I posted an answer and included code that, at least
> > >for some cases, shows potential problems with using lme when the number
> > >of technical replicates is small
> > >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002430.html).
> > >
> > >Nevertheless, Naomi Altman reports using lme/mixed models in a couple of
> > >emails
> > >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-December/003191.html;
> > >https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-January/003481.html).
> > >
> > >After reading about randomizedBlock (package statmod) in a message in BioC
> > >(I think from Gordon), I have tried the mixed models approach again,
> > >since with tech. reps. and no other random effects, we should be able to
> > >use randomizedBlock. Details in 5. below:
> > >
> > >
> > >3. Take the average of the technical replicates
> > >****************************************************
> > >Except for being possibly conservative (and not estimating tech. reps.
> > >variance component), I think this is a "safe" procedure.
> > >This is what I have ended up doing routinely after my disappointing tries
> > >with lme
> > >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002430.html)
> > >and what Bill Kenworthy seemed to end up doing
> > >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-January/003493.html).
> > >
> > >I think this is also what is done at least sometimes in the literature (e.g.,
> > >Huber et al., 2002, Bioinformatics, 18, S96--S104 [the vsn paper]).
> > >
> > >*********
> > >
> > >4. Dealing with replicates in future versions of limma
> > >***********************************************************
> > >
> > >Now, in Sept. 2003 Gordon mentioned that an explicit treatment of tech.
> > >reps. would be in a future version of limma
> > >(https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-September/002411.html)
> > >and I wonder if Gordon meant via mixed-effects models, or some
> > >other way, or if there has been some progress in this area.
> > >
> > >
> > >
> > >5. Using randomizedBlock
> > >*****************************
> > >In a simple setup of control and treatment with dye-swaps, I have done
> > >some numerical comparisons of a t-test on the means of the technical
> > >replicates against lme and randomizedBlock. [The function is
> > >attached]. The t-statistic from the t-test on the means of replicates and
> > >the one from randomizedBlock tend to be the same (if we "positivize" the
> > >dye swaps). The only difference, then, lies in the df we then use to put
> > >a p-value on the statistic. But I don't see how we can use the dfs from
> > >randomizedBlock: they seem way too large. Where am I getting lost?
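
As a rough illustration of the "t-test on the means of replicates" side of this comparison, here is a minimal sketch. It does not reproduce the attached function, lme, or randomizedBlock. The log-ratios, the pair labels, and the +1/-1 orientation coding are assumptions made up for the example.

```python
import math
import statistics

# Hypothetical log-ratios (M-values) for one gene. Each biological pair is
# hybridized twice; the array coded -1 is the dye-swap, so its log-ratio
# has the opposite sign. All values are invented.
arrays = [
    ("pair1", +1, 1.2), ("pair1", -1, -0.8),
    ("pair2", +1, 0.5), ("pair2", -1, -0.9),
    ("pair3", +1, 1.6), ("pair3", -1, -1.4),
]

# "Positivize" the dye-swaps: multiply by the orientation so that every
# array estimates treatment minus control.
by_pair = {}
for pair, orientation, m in arrays:
    by_pair.setdefault(pair, []).append(orientation * m)

# Average the technical replicates, then run a one-sample t-test on the means.
pair_means = [statistics.mean(v) for v in by_pair.values()]
n = len(pair_means)
t_stat = statistics.mean(pair_means) / (statistics.stdev(pair_means) / math.sqrt(n))

df_means = n - 1                           # df from independent biological pairs
df_if_arrays_independent = len(arrays) - 1  # df if every array counted separately

print(f"t = {t_stat:.3f}, df (means) = {df_means}, "
      f"df (all arrays) = {df_if_arrays_independent}")
```

The same t-statistic referred to a distribution with more degrees of freedom yields a smaller p-value, which is why the choice of df matters here.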
> > >
> > >
> > >Best,
> > >
> > >
> > >R.
> > >
> > >
> > >
> > >--
> > >Ramón Díaz-Uriarte
> > >Bioinformatics Unit
> > >Centro Nacional de Investigaciones Oncológicas (CNIO)
> > >(Spanish National Cancer Center)
> > >Melchor Fernández Almagro, 3
> > >28029 Madrid (Spain)
> > >Fax: +-34-91-224-6972
> > >Phone: +-34-91-224-6900
> > >
> > >http://bioinfo.cnio.es/~rdiaz
> > >PGP KeyID: 0xE89B3462
> > >(http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)
>