[BioC] Question regarding handling technical replicates for Affy arrays

Gordon K Smyth smyth at wehi.EDU.AU
Sun May 28 12:44:20 CEST 2006


Hi Noel,

Your code looks correct.  Have you looked at the value of the consensus correlation, to check that
it is reasonable?

Best wishes
Gordon

> Date: Sat, 27 May 2006 17:56:51 -0700 (PDT)
> From: "noel0925 at sbcglobal.net" <noel0925 at sbcglobal.net>
> Subject: [BioC] Question regarding handling technical replicates for
> 	Affy	arrays
> To: bioconductor at stat.math.ethz.ch
>
> Hi All,
>
> There seems to have been much discussion regarding
> the statistical issues surrounding handling of
> technical replicates in gene expression data using the
> Limma package as pertains to two-color arrays, but
> less so for Affy data.
>
> As such, I am still uncertain regarding how best to
> handle a single technical replicate in a dataset 40
> Affy arrays from 3 tumor types.
>
> Clearly, I want to model the biological variation
> between these three tumor types as a fixed effect.
>
> Limma, I understand, is only able to handle 1 random
> effect. Unlike two-color arrays, single color arrays
> do not necessitate allocating random effects to within
> array spot replicates or dye swaps so, I should still
> be able to model the lack of dependence between these
> two technical reps as a random effect using the
> duplicateCorrelation function.
>
> It is unclear to me how to specify this single
> technical replicate however. The examples I have read
> about in the limma manual or other desciptions of its
> functions usually deal with cDNA arrays and usually
> there is a nice structure to the technical
> replication- eg pairs of technical replicates for each
> sample (dye swap or otherwise). In my case, I only
> have  one technical rep (and 20, 10, and 10 biolgical
> reps).
>
> What seemed to make sense to me was:
>
>>classes<- c(rep(1,20), rep(2,10), rep(3,10))
>
>>f<- factor(targets$Target, levels = c("RNA1", "RNA2",
> "RNA3"))
>>design<- model.matrix(~0+f)
>>colnames(design)<- c("RNA1", "RNA2", "RNA3")
>>biolrep <- c(1,2,3,4,5,5,6,7,8,9??40)
>>corfit<- duplicateCorrelation(eset, design, ndups=1,
> block= biolrep)
>>fit<-lmFit(eset,design, ndups=1, block=biolrep
> ,cor=corfit$consensus)
>>contrast.matrix<- makeContrasts(RNA1-RNA3,
> RNA2-RNA3,RNA1-RAN2, levels=design)
>>fit2<- contrasts.fit(fit, contrast.matrix)
>>fit2<-eBayes(fit2)
>
> How else can I specify biolrep?
>
>
>
> Obviously it would be preferable to model the lack of
> dependence rather than go with the alternatives.
>
> i.) Averaging the two technical reps so as to treat
> them as primary data.
> I assume this would be done post-normalization and
> summarization?
>
>
> ii.) I have also read that simply treating the
> technical rep as a biological rep is not too dangerous
> since- the measurement error can be larger than the
> biological variation- but am hesitant since I want to
> perform downstream analysis like clustering and don't
> want an 'extra' sample that really comes from the same
> patient.
>
> iii.) Discard the data from one of the technical reps
> (worse still).
>
> I would greatly appreciate any insight into the best
> way to handle this issue.
>
> Thanks in advance.
> Noelle



More information about the Bioconductor mailing list