[BioC] question about makeVennDiagram

Zhu, Lihua (Julie) Julie.Zhu at umassmed.edu
Mon Mar 12 21:27:11 CET 2012


Ron,

For your dataset, I notice that one dataset contains 173122 peaks.
Therefore, totalTest=100000 is too small; it should be larger than
173122. totalTest is the size of the sample space used to determine whether
the overlap between two datasets is more than would be expected by chance,
so it should equal the number of possible binding events.

As an example, if you estimate, by motif search or other methods, that the
total number of possible binding sites in the genome of interest is 250000,
then you would set totalTest=250000.
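
A minimal sketch of why a too-small totalTest gives no usable p-value,
assuming the p-value comes from a hypergeometric test on the overlap. This
is an illustration in Python, not the makeVennDiagram code itself; the
function name overlap_p_value and its arguments are hypothetical, and the
small numbers in the usage lines are made up to keep the arithmetic fast.

```python
from math import comb, isnan


def overlap_p_value(n1, n2, overlap, total_test):
    """Upper-tail hypergeometric probability P(X >= overlap).

    n1, n2     -- number of peaks in each dataset
    overlap    -- number of overlapping peaks observed
    total_test -- assumed number of possible binding sites (the space size)
    """
    # A space smaller than either peak set is impossible: you cannot draw
    # n1 peaks from fewer than n1 candidate sites. Return NaN to mirror
    # the undefined result in that case.
    if total_test < max(n1, n2):
        return float("nan")

    # Count the ways to pick n2 sites so that k of them fall among the
    # n1 sites of the first dataset, for every k >= overlap.
    tail = sum(
        comb(n1, k) * comb(total_test - n1, n2 - k)
        for k in range(overlap, min(n1, n2) + 1)
    )
    return tail / comb(total_test, n2)


# Space smaller than the largest peak set: the test is undefined.
print(isnan(overlap_p_value(173122, 50000, 1000, 100000)))

# Toy numbers: 3 of 3 peaks overlapping in a space of 10 sites.
print(overlap_p_value(3, 3, 3, 10))
```

Exact binomial coefficients get slow for peak counts in the tens of
thousands, so this is only meant to show how the space size enters the
calculation, not to replace the package's own test.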

Here is a nice discussion by Noah on how to set totalTest:
https://stat.ethz.ch/pipermail/bioconductor/2010-November/036540.html

Hope this helps!

Best regards,

Julie


On 3/12/12 4:05 PM, "Ron Hart" <rhart at rci.rutgers.edu> wrote:

> Thanks, Julie.  If you notice, the second trial in my text file uses
> totalTest=100000, which should be sufficient, right?
>  
> Ron
>  
> 
> From: Zhu, Lihua (Julie) [mailto:Julie.Zhu at umassmed.edu]
> Sent: Monday, March 12, 2012 3:50 PM
> To: Ron Hart
> Cc: bioconductor at r-project.org
> Subject: Re: question about makeVennDiagram
>  
> Dear Ron,
> 
> I noticed that you specified totalTest=100 which is too small for your
> dataset. It should be larger than the largest peak number in your datasets.
> 
> Here is a list of old posts on makeVennDiagram that should help you to choose
> an appropriate number for totalTest.
> 
> https://stat.ethz.ch/pipermail/bioconductor/2010-June/033941.html
> https://stat.ethz.ch/pipermail/bioconductor/2010-November/036540.html
> http://comments.gmane.org/gmane.science.biology.informatics.conductor/32345
> https://stat.ethz.ch/pipermail/bioconductor/attachments/20100905/bb689e19/attachment.pl
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/29476
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/30115
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/35629
> http://comments.gmane.org/gmane.science.biology.informatics.conductor/34765
> 
> Best regards,
> 
> Julie
> 
> 
> On 3/12/12 3:33 PM, "Ron Hart" <rhart at rci.rutgers.edu> wrote:
> Hi Julie,
>  
> Once again, thank you for writing such a useful R package!  I'm working on
> analyzing some data for another manuscript and I'm having trouble generating a
> p-value from makeVennDiagram.  Enclosed is a text file with sample output.
>  
> Can you help me figure out why I get NaN for the p.value?  If it helps there
> are something like 50,000 peaks per sample.
>  
> Thanks very much,
>  
> Ron
>  
>  
> 


