[BioC] question about topTable ranking of limma

Jenny Drnevich drnevich at illinois.edu
Tue Sep 15 21:14:22 CEST 2009


Hi,

It's not necessarily an unfair question to ask "which genes have high 
expression (i.e., many mRNAs) and which genes have low expression in 
this treatment?"  However, you cannot get a quantitative answer to 
this using microarrays, because the expression values of 
different genes are NOT directly comparable. Different probe 
sequences have different binding efficiencies (among other biases), 
so the same number of mRNA copies of one gene may not lead to 
the same measured fluorescence value as the same number of mRNA 
copies of another gene.
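
To make the binding-efficiency point concrete, here is a minimal toy sketch in Python. The copy numbers, the efficiencies and the "fluorescence = efficiency * copies" model are all made-up assumptions for illustration, not a real model of hybridization. It only shows that two genes with identical copy numbers can give different log2 intensities, while a within-gene comparison between two samples is unaffected because the efficiency term cancels.

```python
import math

# Hypothetical numbers, chosen only to illustrate the point.
copies = 1000  # same true mRNA copy number for both genes
efficiency = {"geneA": 0.9, "geneB": 0.1}  # assumed probe binding efficiencies


def log2_intensity(n_copies, eff):
    """Toy model: measured fluorescence = efficiency * copy number."""
    return math.log2(eff * n_copies)


# Between genes: same copy number, different measured log2 intensity.
between_genes = {g: log2_intensity(copies, e) for g, e in efficiency.items()}

# Within a gene: doubling the copy number shifts log2 intensity by the
# same amount for both probes, because the efficiency cancels out.
within_gene_diff = {
    g: log2_intensity(2 * copies, e) - log2_intensity(copies, e)
    for g, e in efficiency.items()
}

print(between_genes)
print(within_gene_diff)
```

Under these assumptions the two genes differ by about 3 log2 units despite equal copy numbers, but both show the same one-unit change when the copy number doubles.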

You are also confused as to what value is measured on a single-color 
array versus a two-color array. A single-color array measures the 
fluorescence value for that probe in that sample, whereas a two-color 
array log ratio is the log of the ratio of fluorescence values for that 
probe between samples. In your example, the log ratios are measuring 
the ratio of mutant to reference FOR THAT PARTICULAR SPOT, not the 
ratio of the mutant value of that spot to the average of the mutant 
values of all other spots.
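
A small sketch of the difference between those two quantities, using invented intensities for three spots (the spot names and values are hypothetical). For simplicity the "grand mean" here is taken over all mutant spots, including the spot itself.

```python
import math

# Hypothetical two-color intensities for three spots on one array.
cy5_mutant = {"spot1": 800.0, "spot2": 200.0, "spot3": 3200.0}
cy3_reference = {"spot1": 400.0, "spot2": 400.0, "spot3": 3200.0}

# What a two-color log ratio is: log2(mutant / reference), spot by spot.
log_ratio = {s: math.log2(cy5_mutant[s] / cy3_reference[s]) for s in cy5_mutant}

# What it is NOT: each mutant value relative to the mean of all mutant spots.
mean_mutant = sum(cy5_mutant.values()) / len(cy5_mutant)
vs_grand_mean = {s: math.log2(v / mean_mutant) for s, v in cy5_mutant.items()}

print(log_ratio)
print(vs_grand_mean)
```

In this toy data spot1 is up two-fold against the reference yet sits below the mutant grand mean, and spot3 is unchanged against the reference yet sits above it, so the two quantities can even disagree in sign.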

I think you could do some sort of qualitative assessment of 
expression level, because genes with a log2 expression value lower 
than 5 almost certainly have fewer mRNA copies than genes with log2 
expression values over, say, 10. However, you cannot do any sort of 
statistical test because the fluorescence values are not directly 
comparable between genes. And finally, in the context of microarray 
experiments, "differential expression" almost universally means 
differences in levels of ONE gene between TWO groups.
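
As a rough illustration of why a one-group fit gives positive t-statistics for every gene (the point Claus makes in the quoted message below), here is a sketch in Python that tests mean log2 intensity against zero. The gene names and values are invented, and this is an ordinary one-sample t-statistic; limma's eBayes uses a moderated variance, so its numbers would differ, but the sign and the general picture should be the same.

```python
import math
import statistics

# Hypothetical log2 intensities for three genes on 4 replicate arrays.
log2_expr = {
    "low_gene": [4.8, 5.1, 4.9, 5.2],
    "mid_gene": [8.0, 7.6, 8.3, 8.1],
    "high_gene": [11.9, 12.2, 12.0, 12.1],
}


def one_sample_t(values, mu=0.0):
    """Ordinary (unmoderated) one-sample t-statistic for H0: mean == mu."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu) / standard_error


# Every gene has a mean log2 intensity far above zero, so every t is
# large and positive, whether the gene is "low" or "high".
t_vs_zero = {gene: one_sample_t(vals) for gene, vals in log2_expr.items()}

print(t_vs_zero)
```

So a significant result here only says the gene's log2 intensity is not zero, which is not the same as saying the gene is over- or under-expressed.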

HTH,
Jenny

At 01:17 PM 9/15/2009, zrl wrote:
>Claus,
>
>Thank you for your response. However, on some points I don't agree with
>you.
>By differentially expressed genes for just one group, I mean the genes
>whose average expression levels across the biological replicates (here 4
>replicates) are over/under the grand mean expression value. I think it's
>similar to the analysis identifying differentially expressed genes in
>an experiment of 4 replicates with two-color arrays (Cy5 Mut, Cy3 Ref), where
>you get a single log ratio for each gene on each of the 4 biological
>replicates. For my design, I just measure the absolute expression value
>(single-channel intensity).
>Therefore, when I fit the limma model, it actually evaluates the average
>expression level (intensity) for each gene across the replicates.
>Of course I could just rank the average intensities from high to low and
>compare them with the mean to get an idea of the differentially expressed
>genes. But I believe limma can do a better job, since I want not only a
>ranking but also a significance level. If the variability is high among the
>replicates, the expression level for this gene may not be reliable even if
>the average is high for this gene. I am just trying to figure out a way to
>separate the over/under expressed values.
>
>If I was wrong, please let me know. Thank you.
>
>
>
>On Tue, Sep 15, 2009 at 12:55 PM, Mayer, Claus-Dieter 
><c.mayer at abdn.ac.uk>wrote:
>
> > Hi,
> >
> > With only one group you can not speak of "differentially expressed" and
> > testing, as that assumes that you have at least two different groups or
> > conditions. The test that you have performed probably just compares gene
> > expression to zero (a moderated one-sample t-test) and for that you would
> > expect all genes to be significant.
> >
> > What you (I am guessing) probably mean by "differentially expressed" is
> > that you are interested to find genes that vary highly between your 4
> > replicates. To find those the best you can do is to rank the genes with
> > respect to their variances/standard deviations. But you can't get a p-value
> > for this, because (unless all values are identical) any gene will have a
> > variance that is significantly higher than 0.
> >
> > Best Wishes
> >
> > Claus
> >
> > > -----Original Message-----
> > > From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-
> > > bounces at stat.math.ethz.ch] On Behalf Of zrl
> > > Sent: 15 September 2009 17:18
> > > To: Heidi Dvinge
> > > Cc: bioconductor
> > > Subject: Re: [BioC] question about topTable ranking of limma
> > >
> > > Sorry for the incomplete message, I clicked send accidentally.
> > >
> > > This analysis is for only one group of 4 biological replicates such as:
> > >                        group
> > > array1              a
> > > array2              a
> > > array3              a
> > > array4              a
> > >
> > > I tried to identify the genes which are differentially expressed in
> > > group a, but there are no other reference groups for comparison. As a
> > > result, all the t statistics are positive.
> > >
> > > Any thoughts? Thanks.
> > >
> > >
> > >
> > > On Tue, Sep 15, 2009 at 11:13 AM, zrl <zrl1974 at gmail.com> wrote:
> > >
> > > > Hi Heidi,
> > > >
> > > > Thank you for your response. Maybe I didn't make my question very
> > > > clear.
> > > > This analysis is for only one group of 4 biological replicates such as:
> > > >                        group
> > > > array1
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Sep 15, 2009 at 4:20 AM, Heidi Dvinge <heidi at ebi.ac.uk> wrote:
> > > >
> > > >>  Hello,
> > > >> you can just sort the topTable result by the t-statistics since these
> > > >> will be either positive or negative, or call it directly with
> > > >> sort.by="t" and then filter for significant p-values.
> > > >>
> > > >> HTH
> > > >> \Heidi
> > > >>
> > > >> On 15 Sep 2009, at 10:05, zrl wrote:
> > > >>
> > > >> Dear List,
> > > >>
> > > >> I have several biological replicate Affy arrays (a simple one group,
> > > >> 4 arrays), and tried to use eBayes to get the differentially expressed
> > > >> genes.
> > > >> The topTable ranked the genes by B statistics, which mixed
> > > >> over-expressed genes and under-expressed genes. My question is how I
> > > >> should separate the over- and under-expressed genes from the topTable
> > > >> results. My idea is to calculate the mean average expression
> > > >> value/intensity (extracted from topTable results using the number of
> > > >> all the genes) and compare the ranked genes with the mean value: if the
> > > >> expression value is greater than the mean, I take this gene as
> > > >> over-expressed; otherwise, it's under-expressed.
> > > >> Since I don't know the underlying implementation of topTable or eBayes,
> > > >> I want to make sure my method is right. Or you may have some better
> > > >> ideas. Thanks.
> > > >>
> > > >> _______________________________________________
> > > >> Bioconductor mailing list
> > > >> Bioconductor at stat.math.ethz.ch
> > > >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > >> Search the archives:
> > > >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> > > >>
> > > >>
> > > >>
> > > >
> > >
> >
> >
> > The University of Aberdeen is a charity registered in Scotland, No
> > SC013683.
> >
>

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu


