[BioC] Fwd: differential binding question

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Jan 4 18:11:05 CET 2012


Hi,

Just wanted to mention that I'm also quite interested in what comes
out of the "accounting for 'copy number'" saga being discussed ...
very cool stuff.

Keep us posted!

-steve

On Wed, Jan 4, 2012 at 11:24 AM, Rory Stark <Rory.Stark at cancer.org.uk> wrote:
> That's the big question Mali! The more I think about it, the less confident I am that it will work.
>
> As I understand it, you want to control for transcripts whose expression may change but whose affinity (the rate at which the protein binds) stays the same. Without the control, higher expression = more transcripts = more RNA pulled down by the IP, even at the same affinity. I'm not sure that subtracting the transcripts independent of binding (the RNA-Seq) will work. Besides the normalization issue relating to the RNA-Seq counts, the problem is that there should always be more transcripts in the control (which includes both bound and unbound transcripts) than in the IP (only bound transcripts). So even if the normalization were perfect, the subtraction would always result in a negative number of counts (which DiffBind sets to a minimum of 1 count per peak).
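A toy illustration of the subtraction problem described above. The counts are invented, the function name is hypothetical, and the floor of 1 simply follows the description in the message; this is a sketch of the arithmetic, not DiffBind's code.

```python
# Invented read counts for three peaks (illustration only).
ip_counts = [120, 45, 300]      # reads in the IP (bound transcripts only)
input_counts = [400, 90, 310]   # reads in the RNA-Seq input (bound + unbound)

MIN_COUNT = 1  # per-peak floor, as described in the message above


def subtract_control(ip, control, floor=MIN_COUNT):
    """Subtract control reads from IP reads, clamping each result at `floor`."""
    return [max(i - c, floor) for i, c in zip(ip, control)]


adjusted = subtract_control(ip_counts, input_counts)
print(adjusted)  # [1, 1, 1]
```

Because the input exceeds the IP at every peak, each peak collapses to the floor and the differences between peaks are lost.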
>
> In this case, subtracting the control is probably too crude. Mark Robinson, an author of edgeR, is doing some interesting work on incorporating copy number information into differential ChIP-Seq analysis, and I am adding it to DiffBind. He is able to cast the problem as a normalization issue. I'm thinking that would be a better approach to your problem: the RNA-Seq gives "copy number" information (overall transcript abundance), and this is incorporated as a normalization term, leaving the differential analysis to identify changes in affinity. I'm working on this right now, so if you are interested, you could be a beta tester; let me know.
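One way to picture "abundance as a normalization term" is to remove the change in input abundance from the change in IP signal on the log scale. The sketch below is only an illustration of that general idea with invented numbers; it is not the method under development for edgeR or DiffBind. The function name and the pseudocount are assumptions, and counts are assumed to be already scaled for library size.

```python
import math


def affinity_log2fc(ip1, ip2, input1, input2, pseudo=0.5):
    """log2 change in IP signal minus log2 change in input abundance.

    `pseudo` is an assumed pseudocount that avoids division by zero.
    """
    ip_change = math.log2((ip2 + pseudo) / (ip1 + pseudo))
    abundance_change = math.log2((input2 + pseudo) / (input1 + pseudo))
    return ip_change - abundance_change


# Transcript A: expression doubles and IP doubles, so affinity is unchanged.
fc_a = affinity_log2fc(ip1=100, ip2=200, input1=1000, input2=2000)

# Transcript B: expression is flat but IP doubles, so affinity has changed.
fc_b = affinity_log2fc(ip1=100, ip2=200, input1=1000, input2=1000)

print(round(fc_a, 2), round(fc_b, 2))
```

Transcript A comes out close to 0 and transcript B close to 1, so only the change that is not explained by abundance is left for the differential analysis.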
>
> I still think it is worth running your data in DiffBind to see how it looks as a start.
>
> Cheers-
> Rory
>
> From: mali salmon <shalmom1 at gmail.com>
> Date: Wed, 4 Jan 2012 15:09:11 +0000
> To: Cancer Research UK <rory.stark at cancer.org.uk>
> Cc: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Subject: Re: [BioC] Fwd: differential binding question
>
> Thanks Rory and Heidi for replying.
> Would read subtraction be enough to account for the difference in gene expression?
> Mali
>
> On Wed, Jan 4, 2012 at 2:11 PM, Rory Stark <Rory.Stark at cancer.org.uk> wrote:
> Hi Mali-
>
> You can try this pretty easily using DiffBind. I suggest calling peaks on each IP separately (each IP with its matching RNA-Seq control) and reading these four peaksets into DiffBind (you could also use two peak callers and read in all eight peaksets to identify more potential sites). DiffBind lets you derive an overall set of peaks (either a superset of all the peaks, or any that overlap in at least two [or more] peaksets), does the read counting (by default subtracting reads in the matching RNA-Seq controls), runs edgeR and/or DESeq to identify differentially bound regions, and offers several plots and reports to characterize the results.
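The "overlap in at least two peaksets" rule above can be sketched as follows. This is a simplified illustration on a single chromosome with invented coordinates, not DiffBind's implementation; the function name and the `min_overlap` parameter are hypothetical, and intervals that merely touch are treated as overlapping.

```python
def consensus_peaks(peaksets, min_overlap=2):
    """Merge overlapping intervals across peaksets and keep merged regions
    supported by at least `min_overlap` distinct peaksets."""
    # Tag every interval with the index of the peakset it came from.
    tagged = sorted(
        (start, end, idx)
        for idx, peaks in enumerate(peaksets)
        for start, end in peaks
    )
    merged = []  # each entry: [start, end, set of supporting peakset indices]
    for start, end, idx in tagged:
        if merged and start <= merged[-1][1]:
            # Overlaps the current merged region: extend it and record support.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2].add(idx)
        else:
            merged.append([start, end, {idx}])
    return [(s, e) for s, e, support in merged if len(support) >= min_overlap]


# Four invented peaksets, one per IP sample.
peaksets = [
    [(100, 200), (500, 600)],   # condition 1, replicate 1
    [(150, 250)],               # condition 1, replicate 2
    [(900, 1000)],              # condition 2, replicate 1
    [(180, 220), (950, 1050)],  # condition 2, replicate 2
]
print(consensus_peaks(peaksets))  # [(100, 250), (900, 1050)]
```

Setting `min_overlap=1` gives the superset of all peaks instead, which here would also include the region (500, 600).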
>
> A couple of caveats: With only two replicates of each condition, your power to reliably identify significant differences is limited. Also, while the IP reads will be normalized, the control reads will not be (unless you do some normalization separately prior to loading them into DiffBind). However, this does seem to be a good place to start!
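As a rough idea of what normalizing the control reads separately could look like, the sketch below rescales control counts to the IP library's sequencing depth. Simple depth scaling is an assumption chosen for illustration (other normalizations may suit RNA-Seq input better), and the function name, counts, and depths are invented.

```python
def scale_control(control_counts, control_depth, ip_depth):
    """Rescale control counts to the sequencing depth of the IP library."""
    factor = ip_depth / control_depth
    return [c * factor for c in control_counts]


control_counts = [400, 90, 310]  # invented per-peak control counts
scaled = scale_control(
    control_counts,
    control_depth=20_000_000,  # assumed total reads in the control library
    ip_depth=10_000_000,       # assumed total reads in the IP library
)
print(scaled)  # [200.0, 45.0, 155.0]
```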
>
> Cheers-
> Rory
>
> ----------------------------------------------------------------------------
> Dr. Rory Stark
>
> Principal Bioinformatics Analyst
>
> Cambridge Research Institute - Cancer Research UK
> Robinson Way
> Cambridge CB2 0RE
> United Kingdom
> +44 1223 404 311
>
> rory.stark at cancer.org.uk
> ----------------------------------------------------------------------------
>
> On 04/01/2012 13:37, "mali salmon" <shalmom1 at gmail.com> wrote:
>
> Dear Users
> We have RNA-IP-seq for two conditions with two biological replicates each.
> So in total we have 8 samples:
> 2 for condition1 IP
> 2 for condition1 Input
> 2 for condition2 IP
> 2 for condition2 Input
> We would like to find differential binding between the two conditions that
> is not influenced by differences in gene expression (the Input samples are
> actually regular RNA-seq).
> I thought of first finding peak regions (maybe by pooling all IP and all
> Input samples) with a ChIP-seq peak caller, then counting how many reads
> fall within these regions in each of the samples, and then running DESeq
> and edgeR in order to find differential binding.
> Can this be done with edgeR and DESeq (again, the Input is different for
> the two conditions, and we would like to cancel out differential gene
> expression)?
> Thanks
> Mali
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> NOTICE AND DISCLAIMER
> This e-mail (including any attachments) is intended for ...{{dropped:18}}



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


