[BioC] DESeq on CCAT identified chipseq peaks

Wed May 14 18:59:25 CEST 2014

You do indeed want to form a consensus peakset from the replicates. How you do this depends on exactly what question you are trying to ask. You can take the union of all peaks and count the reads for each peak in each replicate, or you can use more stringent criteria in determining the consensus peakset, such as keeping only peaks that appear in at least 2 (or 3) replicates, or perhaps the union of peaks that appear in a majority of the samples in each condition (i.e. peaks identified in at least 3 of 5 tumors OR in at least 3 of 5 normals).

The DiffBind package provides tools to do exactly this, and the user guide/vignette walks through an example in some detail. Besides assembling consensus peaksets, DiffBind will handle the counting (with various options) and differential analysis using edgeR, DESeq, and/or DESeq2, and has convenient tools for reporting and plotting results.
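
To make the consensus options above concrete, here is a minimal Python sketch of the idea, not DiffBind's actual implementation. It assumes peaks are available as (chrom, start, end) tuples with BED-style half-open coordinates; the function name consensus_peaks, the min_samples parameter, and the toy sample names are all made up for illustration. Overlapping peaks from all samples are merged, and a merged region is kept only if enough distinct samples contributed a peak to it.

```python
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chrom, start, end), half-open as in BED


def consensus_peaks(peaks_by_sample: Dict[str, List[Peak]],
                    min_samples: int = 1) -> List[Peak]:
    """Merge overlapping peaks across samples and keep the merged
    regions supported by at least `min_samples` distinct samples.
    min_samples=1 gives the plain union of all peaks."""
    # Tag every peak with its sample, then sort by chrom and start.
    tagged = sorted(
        (chrom, start, end, sample)
        for sample, peaks in peaks_by_sample.items()
        for chrom, start, end in peaks
    )
    merged = []  # each entry: [chrom, start, end, set of supporting samples]
    for chrom, start, end, sample in tagged:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start < last[2]:
            # Overlaps the current merged region: extend it.
            last[2] = max(last[2], end)
            last[3].add(sample)
        else:
            merged.append([chrom, start, end, {sample}])
    return [(c, s, e) for c, s, e, samples in merged
            if len(samples) >= min_samples]


# Hypothetical toy data, for illustration only.
toy = {
    "tumor1": [("chr1", 100, 200), ("chr1", 500, 600)],
    "tumor2": [("chr1", 150, 250)],
    "normal1": [("chr1", 180, 300), ("chr2", 10, 50)],
}
union_set = consensus_peaks(toy, min_samples=1)
stringent_set = consensus_peaks(toy, min_samples=2)
```

The per-condition variant (at least 3 of 5 tumors OR at least 3 of 5 normals) could be built by calling the function once per condition and merging the two results. One caveat of this simple approach: chained overlaps can produce a merged region whose supporting peaks do not all overlap each other, so the support count is an approximation.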

Cheers-
Rory

on Wed May 14 18:16:36 CEST 2014 Aditi [guest] guest at bioconductor.org wrote:

> I plan on using DESeq downstream of CCAT identified peaks on 5 tumor and 5 normal samples and I was unsure of how to best create a unified list of peaks and corresponding read counts -
> CCAT outputs different peak regions from each sample. Thus to create a unified list of peak regions and their read counts would you suggest -
>
> A. Taking a union of all the CCAT called peaks and calculating read count in each biological replicate OR
>
> B. Calculating the read count for each peak in each replicate whether or not it has been called in the replicate or not
>
> I saw both being suggested earlier online and I am not sure which is appropriate.
>
> 2. Since this is chipseq and not rna seq data, do you agree that using coverageBed (coverageBed -abam $bamfile -b $CCATpeaks > countdata) would work as good as HTseqcount ?