[BioC] Please add me to the Dropbox containing the vignette data

Blum, Roy Roy.Blum at nyumc.org
Thu Feb 13 23:08:39 CET 2014


Dear Rory,

Thanks a lot for your clarifying response! 
It helps a lot in understanding your pipeline. 

If I understand correctly, dba.report calculates fold changes as the log2 normalized counts in the first condition minus the log2 normalized counts in the second condition, for each peak present in either condition (when minOverlap=1 is set). So even for 'condition-exclusive' peaks (with zero tags at the peak location in the other condition) we would still get a fold-change value, since we would have a log2 normalized value minus zero, which equals the log2 normalized value. Am I correct on this? This aspect wasn't very clear.
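
To make the arithmetic in the question above concrete, here is a small base-R sketch. The counts are invented for illustration. One point worth noting: log2(0) is -Inf in R, so "log2 value minus zero" only holds if an empty interval is floored to a count of 1 (log2(1) = 0). Whether and how DiffBind applies such a floor is not stated in this thread, so the `min_count` floor below is an assumption.

```r
# Hypothetical normalized read counts for three intervals in two conditions.
# The third interval is "condition-exclusive": no reads in condition 2.
cond1 <- c(120, 35, 64)
cond2 <- c(60, 70, 0)

# Naive log2 fold change: log2(cond1) - log2(cond2).
# For the third interval this is 6 - (-Inf) = Inf, because log2(0) is -Inf.
naive_fold <- log2(cond1) - log2(cond2)

# Assumed workaround (not confirmed for DiffBind): floor every count at 1
# before taking logs, so an empty interval contributes log2(1) = 0.
min_count <- 1
floored_fold <- log2(pmax(cond1, min_count)) - log2(pmax(cond2, min_count))

print(naive_fold)
print(floored_fold)
```

With the floor in place, the fold change for the exclusive interval equals the log2 count of the condition that has reads, which is the behaviour the question describes.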

In addition, if I understand correctly, using minOverlap=2 (for an analysis with one sample per condition, across two conditions) would tell DiffBind to ignore all the condition-exclusive peaks and perform calculations only on the overlapping peaks. Am I correct on this?
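
As a conceptual model of the minOverlap question above (not DiffBind's internal code), the filter can be pictured as a threshold on how many peaksets called each merged interval. The presence/absence table, the interval names, and the helper `keep_intervals` are all hypothetical.

```r
# Hypothetical presence/absence table: rows are merged candidate intervals,
# columns are the two peaksets (one sample per condition).
called <- matrix(
  c(TRUE,  TRUE,    # peak called in both conditions
    TRUE,  FALSE,   # exclusive to "before"
    FALSE, TRUE),   # exclusive to "after"
  ncol = 2, byrow = TRUE,
  dimnames = list(c("interval1", "interval2", "interval3"),
                  c("before", "after"))
)

# Keep an interval if it was called in at least min_overlap peaksets.
keep_intervals <- function(called, min_overlap) {
  rownames(called)[rowSums(called) >= min_overlap]
}

print(keep_intervals(called, 1))  # all three intervals
print(keep_intervals(called, 2))  # only the shared interval
```

Under this model, a threshold of 2 with two samples leaves only the shared interval, while a threshold of 1 keeps the condition-exclusive ones as well.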

Finally, how does DiffBind define overlapping peaks? Is there a way to redefine this criterion (for example, an overlap of 1 bp vs. an overlap of 50% of each peak's span)?
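
The two criteria mentioned above (at least 1 bp shared vs. at least 50% of each peak's span shared) can be compared with a short base-R sketch. This only illustrates the two definitions; it is not a description of how DiffBind merges peaks, and all function names and coordinates are hypothetical.

```r
# Width in bp of the overlap between two closed intervals [s1, e1] and
# [s2, e2]; 0 if they do not overlap.
overlap_width <- function(s1, e1, s2, e2) {
  max(0, min(e1, e2) - max(s1, s2) + 1)
}

# Criterion A: the peaks share at least min_bp base pairs.
overlaps_by_bp <- function(s1, e1, s2, e2, min_bp = 1) {
  overlap_width(s1, e1, s2, e2) >= min_bp
}

# Criterion B: the shared region covers at least min_frac of EACH peak.
overlaps_reciprocal <- function(s1, e1, s2, e2, min_frac = 0.5) {
  w <- overlap_width(s1, e1, s2, e2)
  w / (e1 - s1 + 1) >= min_frac && w / (e2 - s2 + 1) >= min_frac
}

# Two hypothetical peaks: 100-199 (100 bp) and 180-379 (200 bp).
# They share positions 180-199, i.e. 20 bp.
overlaps_by_bp(100, 199, 180, 379)       # TRUE: 20 bp >= 1 bp
overlaps_reciprocal(100, 199, 180, 379)  # FALSE: 20/100 and 20/200 are below 0.5
```

The same pair of peaks counts as overlapping under the 1 bp rule and as distinct under the 50% reciprocal rule, so the choice of criterion changes which peaks end up 'condition-exclusive'.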

Thanks a lot!!
Roy

--
Roy Blum, Ph.D.
Senior Research Scientist
Cancer Institute, Smilow Research Building,
New York University School of Medicine,
12th Floor, Room 1206
552 First Ave.
New York, NY, 10016
Mob:   +1 (646)-716-2875
Lab:    +1 (212)-263-2327
http://blumroy.googlepages.com

________________________________________
From: Rory Stark [Rory.Stark at cruk.cam.ac.uk]
Sent: Thursday, February 13, 2014 3:26 PM
To: Blum, Roy
Cc: Gordon Brown; bioconductor at r-project.org
Subject: Re: Please add me to the Dropbox containing the vignette data

Hi Roy-

First, I am obliged to discourage you from doing this type of analysis
without replicates, for two reasons: 1) it is not good science, as
biological and experimental variability is high in these types of
experiments, and your samples may not be representative; and 2) because
the statistical techniques that DiffBind relies on (embodied in the edgeR,
DESeq, and DESeq2 packages) require replication to properly calculate
confidence statistics.

Technically, DiffBind will handle this comparison. You may want to do some
simpler overlaps (dba.plotVenn, dba.overlap) to detect regions identified
as enriched in only one condition. If you want to compute fold changes
based on read counts, you can call dba.count with minOverlap=1, which will
include all the called peaks including those that do not overlap. Then set
up a contrast using dba.contrast with one condition as group1 and the
other as group2 (you will be warned again about the lack of replication).
You can call dba.analyze (again, the underlying method is likely to issue
a warning relating to the lack of replication) to do the comparison, then
call dba.report with th=1 to get all the fold changes, computed as the
log2 normalized counts in the first condition minus the log2 normalized
counts in the second condition for each interval. This report will also
include confidence statistics that you probably shouldn't take very
seriously for the reasons described above.
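
The steps described above can be sketched roughly as follows. This is a hedged outline rather than a tested recipe: it assumes the DiffBind package is installed, that a sample sheet (hypothetical file name `samples.csv`) lists exactly two samples whose Condition column holds the hypothetical labels "before" and "after", and that the function arguments behave as in the DiffBind version current at the time of this thread. Later releases may differ. The wrapper name `compare_without_replicates` is invented for illustration.

```r
# Sketch only: wraps the calls named in the message above in one function.
# Nothing runs until the function is called with a real sample sheet.
compare_without_replicates <- function(sample_sheet = "samples.csv",
                                       cond1 = "before", cond2 = "after") {
  # Load the called peaks for both samples.
  peaks <- DiffBind::dba(sampleSheet = sample_sheet)

  # Simple overlap of the two peaksets: shared vs. condition-exclusive regions.
  olap <- DiffBind::dba.overlap(peaks, mask = 1:2,
                                mode = DiffBind::DBA_OLAP_PEAKS)

  # Count reads in every called peak, including peaks seen in only one sample.
  counts <- DiffBind::dba.count(peaks, minOverlap = 1)

  # Contrast with the first condition as group1 and the second as group2.
  counts <- DiffBind::dba.contrast(
    counts,
    group1 = DiffBind::dba.mask(counts, DiffBind::DBA_CONDITION, cond1),
    group2 = DiffBind::dba.mask(counts, DiffBind::DBA_CONDITION, cond2),
    name1 = cond1, name2 = cond2
  )

  # Expect warnings about the lack of replication here.
  counts <- DiffBind::dba.analyze(counts)

  # th = 1 reports every interval, not only the significant ones.
  list(overlap = olap, report = DiffBind::dba.report(counts, th = 1))
}
```

As noted above, the confidence statistics in such a report should not be taken seriously without replicates; only the fold changes are of interest in this setup.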

Cheers-
Rory

On 13/02/2014 19:16, "Blum, Roy" <Roy.Blum at nyumc.org> wrote:

>Dear Gord and Rory,
>
>I am exploring your DiffBind software and would like to inquire regarding
>the following -
>
>I am referring to a very simple scenario in which DiffBind is loaded with
>data for one histone mark tested across two conditions, before and after
>treatment (no replicates for either condition).
>
>Would it still be possible to run the basic analysis presented in the
>tutorial?
>
>In general, would condition-specific peaks (that do not overlap with a
>corresponding peak in the other condition) still be considered as part of
>the statistical analysis performed by DiffBind? Or is the statistical
>analysis limited to the 'shared peaks', reporting affinity changes only
>within peaks that are shared between the two conditions?
>Is there a way that DiffBind can report on all the condition-exclusive
>peaks (ones that are deposited in only one condition and have zero
>deposition in the other)? How would the fold change be calculated in
>such cases?
>
>
>Thanks a lot!
>Roy
>
>________________________________________
>From: Blum, Roy
>Sent: Thursday, February 13, 2014 10:01 AM
>To: Gordon Brown
>Subject: RE: Please add me to the Dropbox containing the vignette data
>
>Hi Gord,
>
>Thanks for your reply and for the wonderful DiffBind tool!
>
>I have now received the link to the data files from Rory.
>Btw, this is the link:
>https://www.dropbox.com/s/bqxnqhvr7sol1za/DiffBindVignette.zip
>in case someone asks for it in the future.
>
>Best wishes!
>Roy
>
>
>________________________________________
>From: Gordon Brown [Gordon.Brown at cruk.cam.ac.uk]
>Sent: Thursday, February 13, 2014 9:24 AM
>To: Blum, Roy
>Subject: Re: Please add me to the Dropbox containing the vignette data
>
>Hi, Roy,
>
>Sorry for the slow response.  As far as I know, the data should be
>publicly visible, so I suspect the error was just a transient error.  Can
>you re-try?  (Or maybe Rory has already responded, in which case ignore
>this...).
>
>Cheers,
>
> - Gord
>
>
>On 2014-02-10 18:11, "Blum, Roy" <Roy.Blum at nyumc.org> wrote:
>
>>Dear Gordon,
>>
>>
>>I am currently interested in learning how to use your DiffBind software.
>>
>>
>>Would you kindly add me to the Dropbox containing the vignette data?
>>
>>
>>My attempt to execute the command line:
>>source(file.path(system.file("extra",
>>package="DiffBind"),"tamoxifen_GEO.R"))
>>failed ....
>>
>>Here's the output which was printed on my R console (see below).
>>Thanks a lot in advance!  (Rory Stark seems to be away..)
>>
>>
>>Roy Blum
>>
>>The email address which I use for my Dropbox activity is:
>>blumroy at gmail.com (please add this email address as well!, Thanks!)
>>
>>
>>
>>
>>
>>> source(file.path(system.file("extra", package="DiffBind"),"tamoxifen_GEO.R"))
>>Loading required package: Biobase
>>Welcome to Bioconductor
>>
>>    Vignettes contain introductory material; view with
>>    'browseVignettes()'. To cite Bioconductor, see
>>    'citation("Biobase")', and for packages 'citation("pkgname")'.
>>
>>Attaching package: 'Biobase'
>>
>>The following object is masked _by_ '.GlobalEnv':
>>
>>    exprs
>>
>>Setting options('download.file.method.GEOquery'='auto')
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798430/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798430/suppl//GSM798430_SLX-2645.443.s_5_SLX-2577.443.s_8_peaks.txt.gz'
>>ftp data connection made, file length 889489 bytes
>>opened URL
>>downloaded 868 Kb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798431/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798431/suppl//GSM798431_SLX-2576.443.s_7_SLX-2577.443.s_8_peaks.txt.gz'
>>ftp data connection made, file length 863440 bytes
>>opened URL
>>downloaded 843 Kb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798443/suppl/"
>>No supplemental files found
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798440/suppl/"
>>No supplemental files found
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798423/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798423/suppl//GSM798423_SLX-2640.438.s_1_SLX-2574.433.s_2_peaks.txt.gz'
>>ftp data connection made, file length 1566858 bytes
>>opened URL
>>downloaded 1.5 Mb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798424/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798424/suppl//GSM798424_SLX-2773.448.s_1_SLX-2574.433.s_2_peaks.txt.gz'
>>ftp data connection made, file length 1047867 bytes
>>opened URL
>>downloaded 1023 Kb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798425/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798425/suppl//GSM798425_SLX-2943.469.s_2_SLX-2574.433.s_2_peaks.txt.gz'
>>ftp data connection made, file length 1436673 bytes
>>opened URL
>>downloaded 1.4 Mb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798428/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798428/suppl//GSM798428_SLX-2775.448.s_3_T47D_Input_peaks.txt.gz'
>>ftp data connection made, file length 621444 bytes
>>opened URL
>>downloaded 606 Kb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798429/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798429/suppl//GSM798429_SLX-2867.466.s_6_T47D_Input_peaks.txt.gz'
>>ftp data connection made, file length 508000 bytes
>>opened URL
>>downloaded 496 Kb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798442/suppl/"
>>No supplemental files found
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798432/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798432/suppl//GSM798432_SLX-3229.521.s_5_SLX-1651.307.s_1_peaks.txt.gz'
>>ftp data connection made, file length 1099858 bytes
>>opened URL
>>downloaded 1.0 Mb
>>
>>
>>[1] "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798433/suppl/"
>>trying URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798433/suppl//GSM798433_SLX-3230.526.s_4_SLX-3231.526.s_5_peaks.txt.gz'
>>Error in download.file(file.path(url, i), destfile = file.path(storedir, :
>>  cannot open URL 'ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM798nnn/GSM798433/suppl//GSM798433_SLX-3230.526.s_4_SLX-3231.526.s_5_peaks.txt.gz'
>>
>>
>>
>>
>>
>>________________________________________
>>From: Rory Stark [Rory.Stark at cruk.cam.ac.uk]
>>Sent: Monday, February 10, 2014 11:39 AM
>>To: Blum, Roy
>>Subject: Automatic reply: Please add me to the Dropbox containing the
>>vignette data
>>
>>
>>I am out of the office until 3 January. If it is urgent, please contact
>>Matt Eldridge.
>>
>>
>>
>>
>>
>>------------------------------------------------------------
>>This email message, including any attachments, is for the sole use of the
>>intended recipient(s) and may contain information that is proprietary,
>>confidential, and exempt from disclosure under applicable law. Any
>>unauthorized review, use, disclosure, or distribution is prohibited. If
>>you have received this email in error please notify the sender by return
>>email and delete the original message. Please note, the recipient should
>>check this email and any attachments for the presence of viruses. The
>>organization accepts no liability for any damage caused by any virus
>>transmitted by this email.
>>=================================
>>
>>
>
>
>

