[BioC] se.exprs

Rafael A. Irizarry ririzarr at jhsph.edu
Sun Jul 6 23:44:13 MEST 2003


what you did to get se.rma is correct if you believe each gene in raw.data
is equally expressed across arrays, i.e. if the arrays are technical
replicates. if not, se.rma includes variance components (the variation in
true expression between arrays) that se.model.rma does not include.
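
a minimal simulation sketch of this point. it is not run on real array
data: tech.sd, bio.sd and the baseline of 8 are made-up values chosen only
for illustration. when the true expression differs between arrays, the
row-wise sample sd picks up that extra variance on top of the measurement
error:

```r
set.seed(1)
n.genes  <- 5000
n.arrays <- 10
tech.sd  <- 0.2   # assumed measurement-error sd (hypothetical value)
bio.sd   <- 0.5   # assumed sd of true expression between arrays (hypothetical value)

noise <- matrix(rnorm(n.genes * n.arrays, sd = tech.sd), n.genes, n.arrays)
truth <- matrix(rnorm(n.genes * n.arrays, sd = bio.sd),  n.genes, n.arrays)

tech.reps   <- 8 + noise          # same true expression on every array
bio.samples <- 8 + truth + noise  # true expression differs between arrays

# row-wise sample sd, computed the same way as se.rma in the question
sd.tech <- sqrt(apply(tech.reps,   1, var))
sd.bio  <- sqrt(apply(bio.samples, 1, var))

# root-mean-square sd across genes
rms.tech <- sqrt(mean(sd.tech^2))  # close to tech.sd
rms.bio  <- sqrt(mean(sd.bio^2))   # close to sqrt(tech.sd^2 + bio.sd^2)
c(rms.tech, rms.bio)
```

only the first case (technical replicates) gives a sample sd that is
comparable to a measurement-error sd.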

if you have technical replicates, then se.model.rma should be
approximately equal to se.rma provided 1) the model assumed by rma is
exactly right and 2) the number of arrays is very large. i don't think
you have either: the model assumed by rma is useful but not perfect, and
i doubt you have 100s of arrays.

if you don't have technical replicates you will have to adjust the
summary step with an appropriate model to get a more useful se.model.rma.
using rlm instead of median polish gives more satisfying results.
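
a rough sketch of what an rlm-based summary can look like for a single
probeset. this is not the code used inside affy: the probe-by-array matrix
y is simulated, the additive model (array effect plus probe effect, probe
effects summing to zero) is an assumption, and all the numbers are made up.
rlm() comes from the MASS package and medpolish() from stats:

```r
library(MASS)  # provides rlm()

set.seed(1)
n.probes <- 11
n.arrays <- 8
probe.effect <- rnorm(n.probes)
probe.effect <- probe.effect - mean(probe.effect)   # probe effects sum to zero
array.expr   <- rnorm(n.arrays, mean = 8, sd = 0.5) # "true" log2 expression per array

# simulated log2 PM intensities: rows are probes, columns are arrays
y <- outer(probe.effect, array.expr, "+") +
  matrix(rnorm(n.probes * n.arrays, sd = 0.2), n.probes, n.arrays)

# long format; as.vector() stacks the columns, so array varies slowest
d <- data.frame(
  y     = as.vector(y),
  probe = factor(rep(seq_len(n.probes), times = n.arrays)),
  array = factor(rep(seq_len(n.arrays), each  = n.probes))
)

# robust fit: one coefficient per array, sum-to-zero contrasts for probes
fit   <- rlm(y ~ 0 + array + probe, data = d,
             contrasts = list(probe = "contr.sum"))
coefs <- summary(fit)$coefficients
rows  <- grep("^array", rownames(coefs))
expr.rlm <- coefs[rows, "Value"]       # expression estimate per array
se.rlm   <- coefs[rows, "Std. Error"]  # model-based standard error per array

# median polish gives expression estimates but no standard errors
mp      <- medpolish(y, trace.iter = FALSE)
expr.mp <- mp$overall + mp$col
```

the two sets of expression estimates should agree closely on clean data
like this; the difference is that the rlm fit comes with a standard error
for each array's estimate.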

also notice that it is very common, not just for microarray data, for
sample standard deviations to be larger than nominal standard deviations.

for example:

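## 10000 rows (think genes), each with 10 iid N(0,1) observations
x <- matrix(rnorm(100000),10000,10)
means <- apply(x,1,mean)
## nominal sd of each row mean, computed from that row alone
nominal.sd <- apply(x,1,sd)/sqrt(10)
## observed sd of the row means across all rows (a single number)
sample.sd <- sd(means)
## fraction of rows where sample.sd is bigger than nominal.sd;
## should be about 53%-58%
mean(sample.sd > nominal.sd)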
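
the figure in the simulation above can also be checked analytically. a
short sketch, assuming sample.sd is essentially equal to its true value
1/sqrt(10) (reasonable with 10000 rows): sample.sd > nominal.sd then means
the row sd s is below the true sigma = 1, and (n-1)*s^2/sigma^2 follows a
chi-squared distribution with n-1 degrees of freedom:

```r
n <- 10
# P(s < sigma) = P(chi-squared with n-1 df < n-1)
p.exceed <- pchisq(n - 1, df = n - 1)
p.exceed  # roughly 0.56, inside the 53%-58% range
```

so even when everything is exactly normal, the sample sd exceeds the
nominal sd in more than half of the rows.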

hope this helps,
rafael

On Sun, 6 Jul 2003, Mary Putt wrote:

> Hi,
> I have a question about the output of se.exprs. I normalized a data set
> using expresso and the model from the Irizarry et al (2003)
> Biostatistics paper.
> 
> rma.dta<-expresso(raw.data, normalize.method="quantiles.robust",
> bkground.correct="rma", summary.method="medianpolish",
> pmcorrect.method="pmonly")
> 
> I can extract the expression values using
> 
> exprs.rma<-exprs(rma.dta)
> and could compute a sample standard deviation from
> 
> se.rma<-sqrt(apply(exprs.rma, 1, var))
> 
> The square of this estimates the variance of the individual expression
> values.
> 
> I can also extract the model-based estimate of the standard deviation
> from
> 
> se.model.rma<-se.exprs(rma.dta)
> 
> I'm not sure whether se.model.rma (squared) estimates the variance of
> the individual expression values. I have a pdf file that explains some
> of the notation if this is helpful, but it is too long to be accepted by
> standard bioconductor means. I can send it to individuals if this is
> helpful.
> 
> 
> This question was motivated by my observation that the sample standard
> deviation is substantially larger than the model-based standard
> deviation in a dataset that I am working with.
> 
> Many thanks in advance, Mary Putt
>


