[BioC] Statistical comparison of low replicate affy data

Matthew Hannah Hannah at mpimp-golm.mpg.de
Wed Feb 18 14:31:59 MET 2004


I'm looking at how different analyses of affy data perform on a sample data 
set of 2 conditions (untr vs. trt) with 3 biological reps (Ua, Ub, Uc vs. 
Ta, Tb, Tc). I've computed RMA and GCRMA expression measures as standard and 
so have 2 exprSets containing these values.

I've looked at fold changes, and the treatment leads to many genes (1000 with 
RMA, 1600 with GCRMA) that show a 2x change in every pairwise comparison 
(Ua-Ta, Ub-Tb, Uc-Tc all >2). To estimate the false positive rate I made the 
pairwise comparisons within groups (Ua-Ub, Ua-Uc, Ub-Uc and the same for T) and 
was surprised to see that, even with only 3 reps, very few genes met the 2x 
criterion by chance (<5 with RMA, <10 with GCRMA). What are people's views on 
estimating false positives in this way?
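
To make the counting concrete, here is a minimal base R sketch of the 
between-group and within-group checks described above. The toy matrix, its 
values and the helper name count_fold are made up for illustration; the real 
input would be the matrix of RMA or GCRMA values, which are on the log2 scale, 
so a 2x change is a log2 difference greater than 1.

```r
# Count genes whose |log2 difference| exceeds a cutoff in EVERY listed pair.
# x:     genes x arrays matrix of log2 expression values
# pairs: list of length-2 character vectors naming the two columns to compare
# (count_fold is a hypothetical helper, not part of any BioC package)
count_fold <- function(x, pairs, log2_cut = 1) {
  hits <- sapply(pairs, function(p) abs(x[, p[1]] - x[, p[2]]) > log2_cut)
  hits <- matrix(hits, nrow = nrow(x))   # keep matrix shape even for one gene
  sum(rowSums(hits) == length(pairs))
}

# Toy data: 4 genes x 6 arrays, values invented for illustration only
x <- rbind(
  g1 = c(5.0, 5.1, 4.9, 7.0, 7.2, 6.9),   # changed by treatment
  g2 = c(8.0, 8.1, 7.9, 8.0, 8.2, 7.9),   # unchanged
  g3 = c(3.0, 4.5, 6.0, 3.1, 4.4, 6.1),   # variable within groups only
  g4 = c(9.0, 9.0, 9.1, 6.5, 6.4, 6.6)    # changed by treatment
)
colnames(x) <- c("Ua", "Ub", "Uc", "Ta", "Tb", "Tc")

between  <- list(c("Ua", "Ta"), c("Ub", "Tb"), c("Uc", "Tc"))
within_U <- list(c("Ua", "Ub"), c("Ua", "Uc"), c("Ub", "Uc"))

n_between <- count_fold(x, between)    # genes passing the treatment criterion
n_within  <- count_fold(x, within_U)   # genes passing by replicate noise alone
```

On this toy matrix g1 and g4 pass the between-group criterion and only g3 
passes the within-group one.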

I now want to make some statistical comparisons of the data, both paired and 
unpaired. I was thinking of first comparing Ua-Ta etc. (paired) and then 
Uabc-Tabc (unpaired), and then permuting the sample labels, e.g. comparing 
Ua,Tb,Uc with Ta,Ub,Tc and so on in various combinations, paired and unpaired. 
Would this provide reliable false positive rates?
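
As a rough sketch of the unpaired relabelling idea, the base R code below 
enumerates every way of splitting the 6 arrays into two groups of 3 and 
computes a permutation p-value for one gene. The expression values and the 
helper mean_diff are invented for illustration. One thing it makes visible is 
that with 3 vs. 3 arrays there are only choose(6, 3) = 20 labellings, so a 
two-sided permutation p-value for a single gene cannot go below 2/20 = 0.1.

```r
arrays <- c("Ua", "Ub", "Uc", "Ta", "Tb", "Tc")

# Every way of choosing 3 of the 6 arrays to call "group 1"
splits   <- combn(length(arrays), 3)
n_splits <- ncol(splits)               # choose(6, 3)

# Unpaired statistic for one gene: difference of the two group means
# (mean_diff is a hypothetical helper used only in this sketch)
mean_diff <- function(v, idx) mean(v[idx]) - mean(v[-idx])

# Made-up log2 values for one gene, in the order given by 'arrays'
v <- c(5.0, 5.1, 4.9, 7.0, 7.2, 6.9)

perm_stats <- apply(splits, 2, function(idx) mean_diff(v, idx))
observed   <- mean_diff(v, 1:3)        # the true labelling: U = 1:3, T = 4:6

# Two-sided permutation p-value: share of labellings at least as extreme
# as the observed one (small tolerance guards against rounding)
p_perm <- mean(abs(perm_stats) >= abs(observed) - 1e-12)
```

For this strongly changed gene only the true labelling and its mirror image 
are as extreme as the observed statistic, so p_perm comes out at 0.1.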

I have looked into the BioC packages and guess I'll use a t-test with multtest 
correction, LPE (although its documentation doesn't say whether it accepts 
RMA-type data) and SAM, EBAM and EBAM.WILC from siggenes. Are there others I 
should also consider?
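
For the t-test part, this is roughly what I have in mind, sketched in base R 
only: per-gene unpaired and paired t-tests followed by a multiple-testing 
adjustment with p.adjust. The data are simulated and the Benjamini-Hochberg 
method is just one possible choice; the multtest functions would take the 
place of p.adjust in a real analysis, and I have not shown their interface 
here.

```r
set.seed(1)                         # simulated data, for illustration only
n_genes <- 50
x <- matrix(rnorm(n_genes * 6, mean = 8), nrow = n_genes,
            dimnames = list(paste0("gene", 1:n_genes),
                            c("Ua", "Ub", "Uc", "Ta", "Tb", "Tc")))
x[1:5, 4:6] <- x[1:5, 4:6] + 3      # spike in 5 "changed" genes

u_cols <- 1:3                       # untreated arrays
t_cols <- 4:6                       # treated arrays

# Unpaired (Welch) t-test per gene: Uabc vs. Tabc
p_unpaired <- apply(x, 1, function(v) t.test(v[u_cols], v[t_cols])$p.value)

# Paired t-test per gene: Ua-Ta, Ub-Tb, Uc-Tc treated as matched pairs
p_paired <- apply(x, 1, function(v)
  t.test(v[u_cols], v[t_cols], paired = TRUE)$p.value)

# Benjamini-Hochberg adjustment across all genes
p_unpaired_bh <- p.adjust(p_unpaired, method = "BH")
p_paired_bh   <- p.adjust(p_paired,   method = "BH")
```

With only 3 reps per group each test has very few degrees of freedom, which is 
part of why I am also looking at LPE and the siggenes methods.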

My request for help is this: if people have experience of applying these tests 
to affy data, specifically in the form of the RMA-style exprSets my data is 
currently in, could they possibly post or send the R scripts they used? I've 
searched BioC and the help pages, but my attempts so far have returned errors 
and I can't help but think I'm missing something obvious (do I need to get rid 
of the affy IDs?), so any help would speed things up a great deal.

For example
> cl <- c(0,0,0,1,1,1)
> rmasam <- sam(esetrma, cl)
returned - Error in var(v) : missing observations in cov/cor

> rmaebw <- ebam.wilc(esetrma, cl)
returned - Error in 2^data : non-numeric argument to binary operator

If anyone is interested in the results I (eventually) obtain, then let me 
know.


