[BioC] ChIPpeakAnno venn diagram statistics

Ester Feldmesser ester.feldmesser at weizmann.ac.il
Wed Dec 8 09:14:56 CET 2010


Thank you very much for your help.

I have several thoughts regarding the overlap between peaks in ChIP-seq 
analyses:

1. Could I also calculate the p-value in the following way for my example?

phyper(2577-1, 3912, totalTest-3912, 26009, lower.tail =FALSE)

Since the results are not symmetric and, as I understand it, the two 
experiments carry equal weight, I am not sure which is the right way to 
apply the test.

 > phyper(2577-1, 3912, 30000-3912, 26009, lower.tail =FALSE)
[1] 1
 > phyper(2577-1, 30000-26009,26009, 3912, lower.tail =FALSE)
[1] 0
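
A note on the asymmetry: the hypergeometric tail itself does not depend 
on which experiment is treated as the marked set and which as the draw. 
In the second call above, the second argument is 30000-26009 (the peaks 
outside the larger set) rather than 26009, which is probably why the two 
results differ. A minimal sketch of the check, assuming totalTest = 30000 
as in the calls above (the variable names are only illustrative):

```r
# Counts from the example; totalTest = 30000 is the illustrative value used above.
overlap   <- 2577
n1        <- 3912    # peaks in the first experiment
n2        <- 26009   # peaks in the second experiment
totalTest <- 30000

# Experiment 1 as the marked set, experiment 2 as the draw.
p12 <- phyper(overlap - 1, n1, totalTest - n1, n2, lower.tail = FALSE)

# Roles swapped: experiment 2 as the marked set, experiment 1 as the draw.
p21 <- phyper(overlap - 1, n2, totalTest - n2, n1, lower.tail = FALSE)

# The two upper-tail probabilities should agree up to floating-point error.
print(c(p12 = p12, p21 = p21))
print(isTRUE(all.equal(p12, p21)))
```

With the roles swapped consistently, either call can be used, so the 
choice of which experiment comes first should not matter.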

2. Regarding the totalTest, I agree that taking only the peaks we see in 
the two experiments probably underestimates it. On the other hand, 
counting the number of DNA motifs for that factor in the genome may give 
too high a number, because some of the motifs are probably not functional 
and appear in the genome by chance. I admit that criticizing is easier 
than finding a solution, and I have not found a solution I am happy with.
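
To see how strongly the result depends on totalTest, one can scan a few 
candidate values. The sketch below is only an illustration: the candidate 
universe sizes are hypothetical, not recommendations, and the variable 
names are illustrative. Under the hypergeometric model the overlap 
expected by chance is n1*n2/totalTest, so the observed overlap counts as 
enriched only when totalTest exceeds n1*n2/overlap.

```r
# Counts from the example in this thread.
overlap <- 2577
n1      <- 3912    # peaks in the first experiment
n2      <- 26009   # peaks in the second experiment

# Hypothetical candidate universe sizes; none of these is a recommendation.
totalTest <- c(30000, 40000, 50000, 100000)

# Overlap expected by chance under the hypergeometric model.
expected <- n1 * n2 / totalTest

# Upper-tail p-value for each candidate universe size (phyper is vectorized).
pval <- phyper(overlap - 1, n1, totalTest - n1, n2, lower.tail = FALSE)

print(data.frame(totalTest, expected = round(expected, 1), pval))

# Universe size at which the expected overlap equals the observed overlap.
breakEven <- n1 * n2 / overlap
print(breakEven)
```

The p-value moves from non-significant to highly significant as 
totalTest grows past the break-even size, which is why the choice of 
totalTest needs a biological justification.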

Any ideas or comments will be highly appreciated.

Esti

Ester Feldmesser, Ph.D.
Bioinformatics Unit, Department of Biological Services
Weizmann Institute of Science
Levine Building, Room 110
phone: +972-8-934-2614
email: ester.feldmesser at weizmann.ac.il

He who thinketh he leadeth and hath no one following him is only taking a walk.
Anonymous



On 12/7/2010 11:42 PM, Zhu, Lihua (Julie) wrote:
> I want to take this opportunity to thank Noah for sharing his insights and
> experience using the ChIPpeakAnno package.
>
> Ester, here is how the p-value is calculated for overlapping using your
> given example, phyper(2577-1, 3912, totalTest-3912, 26009, lower.tail =
> FALSE).
>
> Best regards,
>
> Julie
>
>
> On 12/7/10 2:42 AM, "Ester Feldmesser"<ester.feldmesser at weizmann.ac.il>
> wrote:
>
>    
>> Hello Noah,
>>
>> I read the archives, but still there are some points that are not clear
>> to me.
>>
>> 1. How is the hypergeometric test implemented? In other words, if we use
>> the phyper R function, what would be p, m and k in the example given
>> below?
>>
>> 2. Does anybody have additional ideas on how to calculate the totalTest
>> when comparing the peaks of two different transcription factors?
>>
>> 3. Is there any other statistical test to calculate the significance of
>> overlapping peaks?
>>
>> Thanks,
>>
>> Esti
>>
>> Ester Feldmesser, Ph.D.
>> Bioinformatics Unit, Department of Biological Services
>> Weizmann Institute of Science
>> Levine Building, Room 110
>> phone: +972-8-934-2614
>> email: ester.feldmesser at weizmann.ac.il
>>
>> He who thinketh he leadeth and hath no one following him is only taking a
>> walk.
>> Anonymous
>>
>>
>>
>> On 12/6/2010 9:16 PM, Noah Dowell wrote:
>>      
>>> Hello Ester,
>>>
>>> Did you search the archives?  I commented on your question extensively and
>>> Julie has also offered helpful insight and those messages are in the
>>> archives.
>>>
>>> Best,
>>>
>>> Noah
>>>
>>>
>>> On Dec 6, 2010, at 4:09 AM, Ester Feldmesser wrote:
>>>
>>>
>>>        
>>>> Hello,
>>>>
>>>> I would like to understand how the hypergeometric test is applied in the
>>>> makeVennDiagram function, specifically what is the total, the sample and the
>>>> success groups.
>>>>
>>>> Let's say we have two peak bed files with 3912 and 26009 peaks respectively
>>>> and an overlap of 2577 peaks, how in this case should the test be applied?
>>>>
>>>> Thank you,
>>>>
>>>> Ester Feldmesser
>>>>
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>>          
>>>
>>>        
>
>


