[BioC] Hypergeometric test in ChIPpeakAnno

Zhu, Lihua (Julie) Julie.Zhu at umassmed.edu
Tue Jun 28 19:12:47 CEST 2011


Abhishek,

totalTest is the total number of potential genomic regions you sampled
to obtain the peaks. It should be much larger than the number of peaks in
any of your peak files. Using the merged peak file would most likely lead
to an underestimate of totalTest.
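
As a rough illustration of why an underestimated totalTest matters, the
sketch below computes an upper-tail hypergeometric p-value for the overlap
between two peak sets with base R's phyper. The helper name overlap_pvalue,
the overlap of 30 peaks, and the two totalTest values (250 and 10000) are
assumptions made up for this example. Only the peak counts 100 and 150 come
from your earlier question, and this is not necessarily the exact
calculation that makeVennDiagram performs.

```r
# Hypothetical helper (not part of ChIPpeakAnno): upper-tail hypergeometric
# p-value for the overlap between two peak sets, given total_test candidate
# genomic regions from which the peaks could have been drawn.
overlap_pvalue <- function(n_a, n_b, n_overlap, total_test) {
  stopifnot(total_test >= n_a, total_test >= n_b,
            n_overlap <= min(n_a, n_b))
  # phyper(q, ..., lower.tail = FALSE) returns P(X > q), so passing
  # n_overlap - 1 gives P(X >= n_overlap).
  phyper(n_overlap - 1, n_a, total_test - n_a, n_b, lower.tail = FALSE)
}

n_a <- 100        # peaks in dataset A (from the question)
n_b <- 150        # peaks in dataset B (from the question)
n_overlap <- 30   # assumed number of overlapping peaks

# Assumed totalTest values: one close to a merged-peak count, one much larger.
p_small <- overlap_pvalue(n_a, n_b, n_overlap, total_test = 250)
p_large <- overlap_pvalue(n_a, n_b, n_overlap, total_test = 10000)

print(c(totalTest_250 = p_small, totalTest_10000 = p_large))
```

With the small totalTest, the overlap expected by chance (100 * 150 / 250 =
60) exceeds the assumed overlap, so the p-value is close to 1. With the
larger totalTest the expected overlap is only 1.5, and the same overlap
looks highly significant.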

Noah has given excellent suggestions on estimating totalTest for ChIP-seq
experiments for different scenarios at
https://stat.ethz.ch/pipermail/bioconductor/2010-November/036540.html.

FYI, I will be giving a practical tutorial at the Bioconductor meeting in
Seattle https://secure.bioconductor.org/BioC2011/labs.php . We could discuss
it further face to face if you happen to attend the meeting as well.
Otherwise, we could schedule a time to talk if needed.

Best regards,

Julie

On 6/28/11 10:47 AM, "Abhishek Singh" <abhisheksinghnl at gmail.com> wrote:

> 
> Dear Prof. Julie,
> 
> From what I understand of the examples and the threads, the best way to
> compute the totalTest value would be to merge all the peak files under study
> (for which a Venn diagram is desired) and then count the number of genomic
> regions present in the merged peak file. That count can then be used as the
> value of totalTest.
> 
> Please correct me if I am wrong.
> 
> Regards
> Abhishek
> 
> On Tue, Jun 28, 2011 at 4:02 PM, Zhu, Lihua (Julie) <Julie.Zhu at umassmed.edu>
> wrote:
>> Abhishek,
>> 
>> This is a very good question which has been very nicely addressed by Noah.
>> Please see the following email threads in the Bioconductor mailing list
>> archives.
>> https://stat.ethz.ch/pipermail/bioconductor/2010-November/036540.html
>> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/29476
>> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/30115
>> 
>> Please cc bioconductor <bioconductor at stat.math.ethz.ch> so that others could
>> benefit and/or contribute. Thanks!
>> 
>> Best regards,
>> 
>> Julie
>> 
>> 
>> On 6/28/11 4:29 AM, "Abhishek Singh" <abhisheksinghnl at gmail.com> wrote:
>> 
>>> Dear Prof. Julie,
>>> 
>>> I have one more question regarding the hypergeometric test implemented in
>>> the ChIPpeakAnno package for constructing Venn diagrams.
>>> 
>>> The command you gave for the sample data in the article is:
>>> 
>>>> makeVennDiagram(RangedDataList(Peaks.Ste12.Replicate1,
>>>> Peaks.Ste12.Replicate2, Peaks.Ste12.Replicate3), NameOfPeaks =
>>>> c("Replicate1","Replicate2","Replicate3"), maxgap = 0, totalTest = 1580)
>>> 
>>> where totalTest indicates how many peaks in total are used in the
>>> hypergeometric test (as indicated in the article).
>>> 
>>> Imagine I have three data sets:
>>> (a) Dataset A has 100 peaks
>>> (b) Dataset B has 150 peaks
>>> (c) Dataset C has 75 peaks
>>> 
>>> How can I compute the value of totalTest for these three data sets?
>>> 
>>> Thank you for your time.
>>> Looking forward to your reply.
>>> 
>>> Regards
>>> Abhishek
>>> 
>>> 
>>> On Mon, Jun 27, 2011 at 7:45 PM, Abhishek Singh <abhisheksinghnl at gmail.com>
>>> wrote:
>>>> Dear Prof. Julie,
>>>> 
>>>> Thank you for providing the code.
>>>> 
>>>> Best Regards,
>>>> Abhishek
>>>> 
>>>> 
>>>> On Mon, Jun 27, 2011 at 5:47 PM, Zhu, Lihua (Julie)
>>>> <Julie.Zhu at umassmed.edu>
>>>> wrote:
>>>>> Abhishek,
>>>>> 
>>>>> Please try the following code snippet, assuming your BED file is
>>>>> test1.bed without a header.
>>>>> 
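>>>>> library(ChIPpeakAnno)
>>>>> # Read the tab-delimited BED file (no header line)
>>>>> test1.bed = read.table("~/Document/test1.bed", sep="\t", skip=0,
>>>>>                        header=FALSE)
>>>>> # Convert the data frame to a RangedData peak list
>>>>> myPeakList = BED2RangedData(test1.bed, header=FALSE)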
>>>>>   
>>>>> 
>>>>> Now you can use annotatePeakInBatch to annotate myPeakList.
>>>>> 
>>>>> For detailed information on how to use the ChIPpeakAnno package, please
>>>>> refer to
>>>>> http://www.bioconductor.org/packages/2.8/bioc/vignettes/ChIPpeakAnno/inst/doc/ChIPpeakAnno.pdf,
>>>>> http://www.bioconductor.org/help/course-materials/2010/BioC2010/BioC2010_ChIPpeakAnno.pdf
>>>>> and Zhu L.J. et al. (2010) ChIPpeakAnno: a Bioconductor package to
>>>>> annotate ChIP-seq and ChIP-chip data. BMC Bioinformatics 2010, 11:237.
>>>>> doi:10.1186/1471-2105-11-237.
>>>>> 
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Julie
>>>>> 
>>>>> 
>>>>> On 6/27/11 7:40 AM, "Abhishek Singh" <abhisheksinghnl at gmail.com> wrote:
>>>>> 
>>>>>> Hi!
>>>>>> 
>>>>>> I was trying to use your R package ChIPpeakAnno to annotate my peak files
>>>>>> which are in .bed format.
>>>>>> 
>>>>>> Somehow I am unable to tell your package to load my input files and
>>>>>> perform the analysis.
>>>>>> 
>>>>>> To briefly explain what I intend to do: I have a peak file (from MACS, in
>>>>>> .bed format) and I want to give this file as an input to your package in
>>>>>> R (which is already installed).
>>>>>> 
>>>>>> Could you roughly tell me what exactly I should do so that the package
>>>>>> reads my files as input?
>>>>>> 
>>>>>> Thank you for your time.
>>>>>> 
>>>>>> Looking forward to your reply.
>>>>>> 
>>>>>> Regards
>>>>>> Abhishek A. Singh
>>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 


