[BioC] How is LIMMA actually calculating the average expression value

Tue Jun 11 02:36:38 CEST 2013

Dear Miriam,

> Date: Sun, 9 Jun 2013 10:28:41 +0000
> From: "Garcia Orellana,Miriam" <mgarciao at ufl.edu>
> To: Bioconductor mailing list [bioconductor at r-project.org]
> 	<bioconductor at r-project.org>
> Subject: [BioC] How is LIMMA actually calculating the average
> 	expression value?
>
> Dear all:

> I have found much the same question asked at least twice, but I 
> still do not have a clear answer about how the ebayes method in LIMMA 
> calculates the average expression value for a given experimental group.

The limma package does not and never has computed the expression level for 
individual experimental groups.  The AveExpr column is the average over all 
arrays (i.e., for all groups, not for one group).  That is clearly documented, 
or so it seems to me.

The limma User's Guide gives in Section 13.1 a description of all 
quantities output by topTable.  It says

   "The AveExpr column gives the average log2-expression level for
   that gene across all the arrays and channels in the experiment."

That seems to me to be completely unambiguous.  What is unclear about 
it?

The help page ?topTable says that AveExpr is

   "average log2-expression for the probe over all arrays and channels,
   same as Amean in the MarrayLM object"

The help page ?"MArrayLM-class" says that Amean is a

   "numeric vector containing the average log-intensity for each probe over
   all the arrays in the original linear model fit. Note this vector does
   not change when a contrast is applied to the fit using contrasts.fit."

Again this seems to me to be unambiguous.
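
To make the distinction concrete, here is a minimal sketch in R.  The 
expression values, probe names and group labels are invented for 
illustration and are not taken from the data quoted below.  It assumes 
a matrix of log2 values with one row per probe and one column per array, 
and contrasts the mean over all arrays (the quantity that the 
documentation quoted above describes as AveExpr or Amean) with 
per-group means, which do not appear in the top table.  The final check 
against lmFit only runs if limma is installed.

```r
# Hypothetical log2 expression values: 2 probes x 6 arrays (two groups of 3).
# All numbers and names here are made up for illustration.
expr <- matrix(
  c(7.9, 8.2, 8.1, 10.8, 10.9, 7.6,
    8.3, 7.2, 8.7,  9.1,  9.4, 9.0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("probe1", "probe2"), paste0("array", 1:6))
)
group <- factor(rep(c("LLA", "HLA"), each = 3), levels = c("LLA", "HLA"))

# Mean over ALL arrays, ignoring group membership.
ave_all <- rowMeans(expr, na.rm = TRUE)

# Per-group means for each probe (rows = probes, columns = groups).
ave_by_group <- t(apply(expr, 1, function(x) tapply(x, group, mean)))

print(ave_all)
print(ave_by_group)

# Optional: compare with limma's Amean when the package is available.
if (requireNamespace("limma", quietly = TRUE)) {
  design <- model.matrix(~ group)
  fit <- limma::lmFit(expr, design)
  stopifnot(isTRUE(all.equal(unname(fit$Amean), unname(ave_all))))
}
```

If per-group averages are wanted, they have to be computed separately 
from the expression matrix, as in ave_by_group here.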

I've said the same thing in response to questions on this list several 
times.  What is unclear?

> I have used GCRMA to normalize my Affymetrix values and then obtained 
> the log2 expression values as (values below do not necessarily 
> correspond to the same probe):
>
> 4395_CTL_LLA.CEL:       7.89
> 4404_CTL_LLA.CEL:       8.21
> 4413_CTL_LLA.CEL:       8.07
> I have calculated by excel:
> Simple mean = 8.055
> Geometric mean = 8.054
> Whereas the top table for average expression of these 3 values gave me: 8.055

There is no such thing as a "top table for average expression".  Top 
tables are always for comparisons between groups.  I have no idea what you 
are trying to do.

Could you please read the documentation, and have a look at the posting 
guide?  If you post again, please give the complete code leading to this 
output, and give the expression values for all arrays in your experiment, 
not just three.

Best wishes
Gordon

> These values are much the same regardless of calculation method. However, 
> when there is more variability among the values, the calculated average 
> expression differs quite substantially:

> 4368_CTL_HLA.CEL:       8.26
> 4394_CTL_HLA.CEL:       7.17
> 4400_CTL_HLA.CEL:       8.70
> Simple mean = 8.042
> Geometric mean = 8.015
> Whereas the top table for average expression of these 3 values gave me: 8.263. In this case this average expression value seems to be the median but on the next set of samples not.

> 4368_CTL_HLA.CEL:       10.758
> 4394_CTL_HLA.CEL:       10.907
> 4400_CTL_HLA.CEL:       7.634
> Simple mean = 8.92
> Geometric mean = 9.766
> Whereas the top table for average expression of these 3 values gave me: 9.862, which in this case is not at all close to the median.
>
> I will appreciate any help on this matter. I would also appreciate 
> any additional thoughts on whether the adjusted average expression 
> (whichever the method is) is good enough to correct for expression 
> variability within a given treatment, so that I do not need to worry 
> about potential outlier expression values, or whether I should be concerned.
>
> Regards,
> Miriam
>
> ********************************
> Miriam Garcia, MS, PhD
> Department of Animal Sciences
> University of Florida
>
