[BioC] Limma then Annaffy?

Thu Dec 29 06:19:08 CET 2005

On Wed, 2005-12-28 at 16:58 -0800, davidl at unr.nevada.edu wrote:
> Hello all,
> 
>      I've been searching the mail archives for about 2 hours without any luck
> finding an answer to this problem.  I was just wondering if there was a way to
> take the results that I obtained from limma and plug them into some sort of
> annotation package (such as annaffy).  I am using Affymetrix moe4302 gene chips
> and would like to learn more about the genes limma found to be differentially
> expressed between my two groups.  I was able to use multtest and annaffy to
> create the html table, but I would really like to use the differentially
> expressed genes from limma (the ones you see in topTable, etc.)
> 
The object returned by topTable should contain an item called "ID" that
contains the gene identifiers.  So after fitting the model with lmFit
and calculating the p-values with eBayes, I extract annotation for the
top n genes (in the example below, the gene symbol for the top 100
genes) as follows:

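# topTable returns only the top 10 genes by default, so request 100 explicitly
topGenes <- topTable(fit2, number = 100)
topSymbols <- getText(aafSymbol(as.character(topGenes$ID), "hgu133plus2"))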

And similarly for other annotations.  I presume you could do the same
for the moe4302 chip.
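
A slightly fuller sketch of the same idea, in case it helps: it annotates the top n probes from a limma fit and writes an HTML table like the one you made before. It assumes limma, annaffy, and the annotation data package for your chip are installed. The function name annotate_top_genes and the output file name are made up for illustration, and the fallback to row names is my assumption for limma versions whose topTable output has no "ID" column.

```r
# Hypothetical helper (not part of limma or annaffy): annotate the top n
# probes from an eBayes() fit and save the annotation table as HTML.
annotate_top_genes <- function(fit, chip, n = 100,
                               file = "limma_top_genes.html") {
  for (pkg in c("limma", "annaffy")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("Package '", pkg, "' is required but not installed.")
    }
  }

  # Ask for n genes explicitly; topTable returns only 10 by default.
  top <- limma::topTable(fit, number = n)

  # Assumption: probe IDs are in an "ID" column if present, else in row names.
  probes <- if ("ID" %in% colnames(top)) as.character(top$ID) else rownames(top)

  # Build an annotation table with all available annaffy columns.
  ann <- annaffy::aafTableAnn(probes, chip, annaffy::aaf.handler())
  annaffy::saveHTML(ann, file, title = "Top genes from limma")

  invisible(ann)
}
```

With the fit from above you would call annotate_top_genes(fit2, "hgu133plus2"). For your mouse arrays, replace the chip argument with the name of whichever annotation data package matches your chip; I have not checked what that package is called.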

>      Also, in searching the mail archives, I saw a few emails indicating that
> the adjusted p-values and B values found with limma should not be considered
> absolutely correct because of assumptions limma makes about normality (or
> something along those lines).  Does this mean that it would be wise to use
> another package (or another program?) to find p-values for differential
> expression?  I would like to find p-values at some point that are meaningful,
> to a certain extent, on their own, as opposed to p-values which indicate just
> the relative order of differential expression among genes (and aren't
> associated with an actual probability of the absence of differential
> expression) (that was worded weird, sorry).  If the assumptions about normality
> are the problem, is there a Wilcoxon-type test that would come recommended as
> part of a Bioconductor package? I'm interested in using an FDR-type adjustment
> for deciding my p-value cut-offs.  Is there any consensus as to the best way to
> do this right now?
> 

"Not absolutely correct" does not mean "not meaningful."  You could
certainly test the normality of your data (say, with ks.test), and then
use the Wilcoxon rank sum test if it violates this assumption.  But I would
say the assumption of independence is the bigger problem, as genes are
clearly not independent of one another.
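
If you want to try the rank-based route, here is a minimal sketch in base R, using simulated numbers in place of real expression values (the matrix expr, the 5-versus-5 group layout, and the 0.05 cut-off are all assumptions for illustration). It runs a normality check on one gene, a Wilcoxon rank sum test per gene, and then a Benjamini-Hochberg FDR adjustment.

```r
# Simulated stand-in data: 1000 genes (rows) by 10 arrays (columns),
# 5 arrays per group. Replace with your own expression matrix.
set.seed(1)
expr  <- matrix(rnorm(1000 * 10), nrow = 1000)
group <- factor(rep(c("A", "B"), each = 5))

# Normality check for a single gene. Estimating the mean and sd from the
# same data makes this KS p-value only approximate.
x1 <- expr[1, ]
ks_gene1 <- ks.test(x1, "pnorm", mean = mean(x1), sd = sd(x1))

# Wilcoxon rank sum test for every gene, comparing group A with group B.
p_wilcox <- apply(expr, 1, function(x) {
  wilcox.test(x[group == "A"], x[group == "B"])$p.value
})

# Benjamini-Hochberg adjustment to control the false discovery rate.
p_bh <- p.adjust(p_wilcox, method = "BH")

# Number of genes passing a 5% FDR cut-off.
n_sig <- sum(p_bh < 0.05)
```

One thing to keep in mind: with 5 arrays per group the smallest two-sided p-value the exact Wilcoxon test can give is 2/252, about 0.008, so after adjusting over thousands of genes it may be impossible for any gene to pass an FDR cut-off, however strong the signal.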

On the other hand, the limma package has been prepared by bright and
thoughtful individuals, peer-reviewed, and usefully employed in the name
of scientific discovery, so I would make the empiric argument that its
modeling of the distribution of microarray data is not half bad.  And
knowing which genes are most likely differentially expressed is of
greater interest to me than the precise p-value (as long as the
estimated p-values are sufficiently minuscule to suggest a low FDR even
if they are off by a couple of factors of 10).

>       Basically, I'm just really overwhelmed by the variety of analysis methods
> that exist right now for microarrays.  I'm sorry if the answer to my first
> question is located in a conspicuous place that I happened to miss and I'm very
> appreciative of any help that anyone would like to offer.
> 

At the end of the day, after consulting your colleagues and the
literature, you just have to pick a method out of the many available
that makes sense to you for your analysis, and see if this method
produces results that are useful, replicable, and validatable. My
experience thus far suggests that, when present, the major signals in
microarray data tend to come through despite sub-optimal analytic
methodology.  I do not mean to say that rigor or optimization aren't
good (because they are very good), but I do mean to say that paralysis
is bad. Like my Dad used to say...if it works, it ain't broke.

I hope that is slightly helpful.

-Eric

> Thank you very much,
> 
> Dave
