[BioC] enrichment packages that accept t-stat (or related stat) as input
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Oct 25 01:55:57 CEST 2013
Dear Pekka,
Thanks for your comments and for your interest in the camera procedure.
In general, it is not possible to run the camera() or roast() procedures
on pre-computed test statistics. Both functions need to have all the
expression data in order to estimate the inter-gene correlations.
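A minimal sketch of such a call, using simulated data: the data values, the set names (set1, set2) and the choice of inter.gene.cor = NA are assumptions made for illustration. In the limma releases I am aware of, inter.gene.cor = NA requests a data-based estimate of the inter-gene correlation, whereas newer releases default to a preset value.

```r
library(limma)

set.seed(2013)
# Simulated log-expression values: 1000 genes, 6 arrays (3 control, 3 treated)
y <- matrix(rnorm(1000 * 6), nrow = 1000, ncol = 6)
design <- cbind(Intercept = 1, Group = c(0, 0, 0, 1, 1, 1))

# Hypothetical gene sets, given as row indices of y
index <- list(set1 = 1:20, set2 = 21:40)
# Make set1 genuinely up-regulated in the treated arrays
y[index$set1, 4:6] <- y[index$set1, 4:6] + 1

# camera() is given the whole expression matrix plus the design, not a
# vector of t-statistics; inter.gene.cor = NA asks for the correlation
# within each set to be estimated from the data
res <- camera(y, index, design, contrast = 2, inter.gene.cor = NA)
print(res)
```

The input is the matrix y itself, so there is no way to substitute a vector of pre-computed t-statistics in this call.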
I agree that the same set of VIFs (variance inflation factors) would apply
to different comparisons for the same linear model, so in principle we
could store the VIFs to allow them to be reused for different contrasts for
the same data. The syntax for that would require quite a bit of thought,
however, and you are the first to ask for it. The camera() function is very
fast as it is, so we are not planning to make such a modification in the
very near term.
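The point about shared VIFs can be checked numerically. The sketch below assumes simulated data with three made-up groups (A, B, C) and two hypothetical gene sets. It runs camera() for two different coefficients of the same design and rebuilds the VIF from the reported correlation using VIF = 1 + (m - 1) * correlation, where m is the number of genes in the set.

```r
library(limma)

set.seed(1)
# Simulated data: 500 genes, 9 arrays in three groups (all names are made up)
y <- matrix(rnorm(500 * 9), nrow = 500, ncol = 9)
group <- factor(rep(c("A", "B", "C"), each = 3))
design <- model.matrix(~group)   # columns: (Intercept), groupB, groupC
index <- list(set1 = 1:25, set2 = 26:60)

# inter.gene.cor = NA requests a data-based correlation estimate per set;
# sort = FALSE keeps the sets in the same order in both tables
resB <- camera(y, index, design, contrast = 2, inter.gene.cor = NA, sort = FALSE)
resC <- camera(y, index, design, contrast = 3, inter.gene.cor = NA, sort = FALSE)

# Rebuild the VIF for each set from the reported correlation
vifB <- 1 + (resB$NGenes - 1) * resB$Correlation
vifC <- 1 + (resC$NGenes - 1) * resC$Correlation
print(cbind(vifB, vifC))
print(all.equal(vifB, vifC))
```

The two VIF columns should agree because the correlation estimate depends only on the residuals from the linear model, which are the same whichever coefficient is tested.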
It may be possible to generalize camera() to F-statistics, but it will be
a serious mathematical research project to work out the appropriate VIF
and test modifications. You won't be able to do that in a valid way
simply by hacking the code.
Regarding write.fit(), you could simply use
out <- camera(...)
write.table(out, file="cameraresults.txt")
Would that satisfy your needs?
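A slightly fuller sketch of the same idea, assuming simulated data and made-up set names. The tab separator, quote = FALSE and col.names = NA settings are just one convenient choice for spreadsheet-friendly output, and the file is written to a temporary directory here.

```r
library(limma)

set.seed(42)
# Simulated data: 200 genes, 6 arrays, three hypothetical gene sets
y <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)
design <- cbind(Intercept = 1, Group = c(0, 0, 0, 1, 1, 1))
index <- list(set1 = 1:15, set2 = 16:40, set3 = 41:80)

out <- camera(y, index, design)

# Tab-delimited file; col.names = NA writes an empty header cell above the
# row names (the gene set names) so the columns line up in a spreadsheet
outfile <- file.path(tempdir(), "cameraresults.txt")
write.table(out, file = outfile, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back to check the round trip
back <- read.delim(outfile, row.names = 1)
print(back)
```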
In recent limma releases, we have modified mroast() to give output as a
data.frame, so that the format is more like that from camera(), as part of
our continuing development of gene set methods.
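To see the two output formats side by side, here is a small sketch on simulated data (the data, set names and nrot value are arbitrary choices for illustration):

```r
library(limma)

set.seed(7)
# Simulated data: 300 genes, 6 arrays, two hypothetical gene sets
y <- matrix(rnorm(300 * 6), nrow = 300, ncol = 6)
design <- cbind(Intercept = 1, Group = c(0, 0, 0, 1, 1, 1))
index <- list(set1 = 1:20, set2 = 21:50)
y[index$set1, 4:6] <- y[index$set1, 4:6] + 1

# Rotation-based self-contained test for each set
rot <- mroast(y, index, design, contrast = 2, nrot = 999)
# Competitive test for the same sets
cam <- camera(y, index, design, contrast = 2)

# Both results are data.frames with one row per gene set
print(class(rot))
print(rot)
print(cam)
```

Because both results are data.frames, either can be passed to write.table() in the same way.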
Best wishes
Gordon
-------------- original message ---------------
[BioC] enrichment packages that accept t-stat (or related stat) as input
Pekka Kohonen pkpekka at gmail.com
Thu Oct 24 12:00:46 CEST 2013
Dear Juliet, Gordon,
I am also looking into using pre-computed statistics with camera, both to
speed up computation for a web service and to allow statistics that are not
currently supported by the limma/camera package (AFAIK), such as the
F-statistic, to be used. So I am trying to decompose the camera function in
limma so that it can make use of pre-computed statistics. I wonder if
someone has already done so? Could the F-statistic (as reported by the
write.fit function, for instance) be used in camera directly, or are there
some statistical assumptions that are violated? Probably using the
rank-based version is the safest option.
It seems to me that, in order to pre-compute as much as possible in limma
(when the gene sets are not known in advance), you can pre-compute the
limma/eBayes gene-wise statistics and array weights. But you still have to
estimate the variance inflation factor for each gene set, although the same
factor can be used for all the comparisons in the linear model.
It would be nice to have a "write.fit" type function for the gene-set
tests as well. It is one of my favorite functions in limma.
I have used GSVA to perform linear modelling for gene set testing as well,
but don't completely trust the statistical validity of the results. Maybe
setting trend=TRUE would alleviate some concerns about the normality
assumptions being violated. Also, it apparently needs at least 10 samples
to estimate the distribution of gene set statistics, but that is OK for
dose-response modelling.
Thank you Gordon for your work on limma! I am also finding "voom" to be a
really nice function and have used it to analyze label-free proteomics
experiments as well.
Best Regards,
Pekka
2013/8/30 Gordon K Smyth <smyth at wehi.edu.au>:
> Dear Juliet,
>
> Why not use the enrichment functions that are already part of the limma
> package? See
>
> ?roast
> ?camera
> ?romer
>
> and references therein.
>
> Best wishes
> Gordon
>
>
>> Message: 19
>> Date: Thu, 29 Aug 2013 20:43:04 -0400
>> From: Juliet Hannah <juliet.hannah at gmail.com>
>> To: Robert Castelo <robert.castelo at upf.edu>
>> Cc: Bioconductor mailing list <bioconductor at r-project.org>
>> Subject: Re: [BioC] enrichment packages that accept t-stat (or related
>> stat) as input
>>
>> Hi Robert,
>>
>> Thanks for your response. I will look into it.
>>
>> Also, is it correct that GSVA always requires an expression matrix? It
>> seems that it integrates with limma, so if I have done an analysis in
>> limma, does this mean that I should be able to use GSVA for an enrichment
>> analysis?
>>
>> Thanks,
>>
>> Juliet
>>
>>
>> On Thu, Aug 29, 2013 at 2:43 AM, Robert Castelo
>> <robert.castelo at upf.edu> wrote:
>>
>>> Juliet,
>>>
>>> i think the first 5 pages in the vignette entitled "Using Categories to
>>> Analyze Microarray Data" from the Category package:
>>>
>>>
>>>
>>> http://www.bioconductor.org/packages/release/bioc/html/Category.html
>>>
>>> may be doing what you are looking for.
>>>
>>> cheers,
>>> robert.
>>>
>>>
>>> On 08/28/2013 08:04 PM, Juliet Hannah wrote:
>>>
>>>> All,
>>>>
>>>> I am looking for a Bioconductor enrichment package that does
>>>> something similar to GSEA for pre-computed test statistics. This
>>>> method would not rely on a cutoff. That is, rather than passing an
>>>> expression matrix, one can compute summaries outside of the package
>>>> (such as a limma t), and then pass these. Any suggestions?
>>>>
>>>> Thanks,
>>>>
>>>> Juliet
>>>>
>>>>
>>> --
>>> Robert Castelo, PhD
>>> Associate Professor
>>> Dept. of Experimental and Health Sciences
>>> Universitat Pompeu Fabra (UPF)
>>> Barcelona Biomedical Research Park (PRBB)
>>> Dr Aiguader 88
>>> E-08003 Barcelona, Spain
>>> telf: +34.933.160.514
>>> fax: +34.933.160.550