[BioC] Statistical comparison of low replicate affy data

James MacDonald jmacdon at med.umich.edu
Wed Feb 18 16:44:39 MET 2004

For siggenes you can't pass an exprSet. You have to pass a matrix of
expression values, e.g., exprs(esetrma). Since the affy IDs will be the
row names of the matrix, you don't have to get rid of them.

Also note that any permutation testing you do with these data is going
to be a bit sketchy because the null distribution is going to be
extremely coarse. For instance, with 3 vs. 3 arrays there are only 20
ways (6 choose 3) to split your data into two groups for estimating the
null in the unpaired case. A general (minimum) recommendation for
estimating the null is to use 500-1000 permutations, which you
obviously don't have.
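
To make the coarseness concrete, here is a small base-R sketch that
counts the distinct relabelings for a 3 vs. 3 design. The cl vector is
the one from the original question; the paired count assumes the pairs
are Ua-Ta, Ub-Tb and Uc-Tc, and the other variable names are only for
illustration.

```r
# Class labels from the original question: 3 untreated (0), 3 treated (1)
cl <- c(0, 0, 0, 1, 1, 1)
n <- length(cl)        # total number of arrays
k <- sum(cl == 1)      # number of arrays in the treated group

# Unpaired case: every way of choosing which k of the n arrays are "treated"
n_unpaired <- choose(n, k)

# Enumerate them explicitly; each column holds the indices labelled "treated"
relabelings <- combn(n, k)

# Paired case (assumed pairing Ua-Ta, Ub-Tb, Uc-Tc):
# each of the k pairs can independently be flipped or left alone
n_paired <- 2^k

# Smallest one-sided permutation p-value a single gene could reach
# when the observed labelling is counted among the relabelings
min_p_unpaired <- 1 / n_unpaired

print(c(unpaired = n_unpaired, paired = n_paired))
print(min_p_unpaired)
```

Either way the count is far short of the 500-1000 permutations mentioned
above.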

With these data you might be better off using limma for the analysis
and using a non-permuted FDR to estimate false positives. There is an
example in the online manual for limma that does almost exactly what you
are trying to accomplish.
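
As a rough illustration of what a non-permuted FDR looks like, the
sketch below runs an ordinary per-gene t-test on a simulated matrix and
adjusts the p-values with Benjamini-Hochberg via p.adjust. Note that
this is a plain t-test, not limma's moderated t-statistic; the simulated
matrix x merely stands in for exprs(esetrma), and the probe names and
the spiked-in shift are made up for the example.

```r
set.seed(1)
cl <- c(0, 0, 0, 1, 1, 1)

# Simulated stand-in for exprs(esetrma): 1000 genes x 6 arrays, log2 scale
x <- matrix(rnorm(1000 * 6, mean = 8, sd = 0.3), nrow = 1000)
rownames(x) <- paste0("probe_", seq_len(nrow(x)))  # hypothetical IDs

# Spike an artificial shift into the treated arrays for the first 50 genes
x[1:50, cl == 1] <- x[1:50, cl == 1] + 2

# Ordinary equal-variance two-sample t-test, one gene (row) at a time
pvals <- apply(x, 1, function(g) {
  t.test(g[cl == 1], g[cl == 0], var.equal = TRUE)$p.value
})

# Benjamini-Hochberg adjustment: no permutations involved
fdr <- p.adjust(pvals, method = "BH")

# Genes called at an FDR of 5%
n_called <- sum(fdr < 0.05)
print(n_called)
print(head(sort(fdr)))
```

With limma itself the analogous steps would go through lmFit, eBayes
and topTable on the real expression matrix; the manual example is the
place to check the exact arguments.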



James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109

>>> "Matthew  Hannah" <Hannah at mpimp-golm.mpg.de> 02/18/04 08:31AM >>>

I'm looking at how different analyses of affy data perform on a sample
set of 2 conditions (untr vs. trt) with 3 biological reps (Ua, Ub, Uc;
Ta, Tb, Tc). I've computed RMA and GCRMA expression measures as standard
and so have 2 exprSets containing these values.

I've looked at fold changes and the treatment leads to many (1000 - RMA,
1600 - GCRMA) pairwise 2x changes (Ua-Ta, Ub-Tb, Uc-Tc all >2). In order
to estimate the false positive rate I made pairwise comparisons within
groups (Ua-Ub, Ua-Uc, Ub-Uc and the same for T) and was surprised to see
that with only 3 reps there were very few genes that met the 2x criterion
by chance (<5 - RMA, <10 - GCRMA). What are people's views on estimating
false positives in this way?

I now want to make some statistical comparisons of the data, both paired
and unpaired. I was thinking of making these comparisons first Ua-Ta...
(paired) and then Uabc-Tabc (unpaired), and then permuting the data so
as to compare Ua,Tb,Uc - Ta,Ub,Tc... etc. in various combinations, paired
and unpaired. Would this provide reliable false positive rates?

I have looked into the BioC packages and guess I'll use a t-test with
correction, LPE (although this doesn't say it accepts RMA-type data?)
and SAM, EBAM & EBAM.WILC from siggenes. Are there others I should also
consider?
My request for help is this: if people have experience of applying these
tests to affy data, specifically in the form of the RMA-style exprSets my
data is currently in, could they possibly post or send the R scripts they
used? I've obviously searched BioC and help, but my attempts so far have
returned errors and I can't help but think I'm missing something obvious
(need to get rid of the affy IDs?), and obviously help would speed things
up a great deal.

For example
> cl <- c(0,0,0,1,1,1)
> rmasam <- sam(esetrma, cl)
returned - Error in var(v) : missing observations in cov/cor

> rmaebw <- ebam.wilc(esetrma, cl)
returned - Error in 2^data : non-numeric argument to binary operator

Obviously if anyone is interested in what results I (eventually) obtain
let me know.



Bioconductor mailing list
Bioconductor at stat.math.ethz.ch 
