[BioC] CAMERA for non-directional gene sets

Wu, Di dwu at fas.harvard.edu
Tue Sep 4 20:09:11 CEST 2012


Dear Simon,

CAMERA outputs test p-values for the directional alternatives: up, down, and either. In your example, you are right that CAMERA gives a very large p-value, indicating non-significance.

It doesn't output a non-directional p-value (we also call the non-directional test the test for the "mixed" direction), because we found that the correlation effects in the mixed-direction test are quite complicated.
CAMERA is a competitive gene set test; as you may know, the two major types of gene set test are competitive and self-contained (see Goeman, J. J. and Bühlmann, P. (2007), and the CAMERA paper).
Generally, to test a mixed direction under a competitive hypothesis, the Wilcoxon mean-rank gene set test (wilcoxGST in limma) can be used, although this method ignores gene-gene correlations. In my experience, the correlation effect in the mixed-direction test is much weaker than in the directional tests.
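
To illustrate the idea, here is a minimal Python sketch of a rank-based competitive test applied to the example from this thread (half of the set at -10, half at +10). It is not limma's code: the function name rank_sum_p_upper, the 1000 simulated background statistics, and the set size of 20 are all assumptions made for illustration. It uses a normal approximation without tie correction, and like wilcoxGST it ignores gene-gene correlations.

```python
import math
import random


def rank_sum_p_upper(set_values, other_values):
    """One-sided Wilcoxon rank-sum p-value (normal approximation, no tie
    correction) for the set values tending to be larger than the others."""
    n1, n2 = len(set_values), len(other_values)
    pooled = sorted(
        [(v, 1) for v in set_values] + [(v, 0) for v in other_values]
    )
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum += avg_rank * sum(flag for _, flag in pooled[i:j])
        i = j
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum - mean_w) / sd_w
    # Upper-tail probability of a standard normal.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


random.seed(0)
# Hypothetical data: background genes with unremarkable statistics,
# plus a set whose genes change strongly but in opposite directions.
background = [random.gauss(0.0, 1.0) for _ in range(1000)]
gene_set = [-10.0] * 10 + [10.0] * 10

# Directional ("up") test on the signed statistics.
p_up = rank_sum_p_upper(gene_set, background)

# "Mixed" test: rank the absolute statistics instead.
p_mixed = rank_sum_p_upper(
    [abs(v) for v in gene_set], [abs(v) for v in background]
)

print("directional (up) p-value:", p_up)
print("mixed p-value:", p_mixed)
```

On the signed statistics the low and high ranks of the set cancel, so the directional p-value is uninformative; on the absolute statistics the whole set ranks at the top and the p-value becomes very small.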

Another test, ROAST, which is self-contained, can also be used to help understand the non-directional tests.
(http://bioinformatics.oxfordjournals.org/content/early/2010/07/07/bioinformatics.btq401.full.pdf)

Hope this helps.

Di





----
Di Wu
Postdoctoral fellow
Harvard University, Statistics Department
Harvard Medical School
Science Center, 1 Oxford Street, Cambridge, MA 02138-2901 USA

________________________________________
From: Simon de Bernard [simon.debernard at altrabio.com]
Sent: Tuesday, September 04, 2012 12:06 PM
To: Wu, Di
Cc: bioconductor at r-project.org
Subject: Re: [BioC] CAMERA for non-directional gene sets

Dear Di,

thanks for your answer.

> Thank you for your interest in using CAMERA. It has lots of good features, such as holding the correct false positive rate and having good power.

That's why it piqued my interest ;-)

>  It can be used when you have multiple gene sets, e.g., GO as you mentioned.
>
> Currently, the default test statistic for individual genes is the moderated t, which is a variant of the ordinary t. See Smyth 2004. (http://www.statsci.org/smyth/pubs/ebayes.pdf)
>
> It is up to the user whether ranks (of the moderated t) should be used or not.
>
> Of course, it is easy to edit the code to allow log fold change to represent the change of the individual genes. What other statistics for individual genes would you be interested in using?

Sorry for mixing up logFC and moderated t. However, isn't it still an approach only appropriate for "directional" gene sets?

Suppose I have a gene set for which I know that genes should be differentially expressed but not necessarily in the same direction. If half the genes in my set have a statistic of -10 and the other half +10, won't the current implementation give me p=1 when I would expect significance?

Best regards,

Simon.

> It is also worth noting that, according to other users, it is safe to set “allow.neg.cor=FALSE”, so that the correlation is set to zero when the calculated correlation is negative.
>
> Gordon may also have some insight regarding your question.
>
> Enjoy using CAMERA.
>
> Di
>
>
>
>
> ----
> Di Wu
> Postdoctoral fellow
> Harvard University, Statistics Department
> Harvard Medical School
> Science Center, 1 Oxford Street, Cambridge, MA 02138-2901 USA


