[BioC] DiffBind for Galaxy

Rory Stark Rory.Stark at cruk.cam.ac.uk
Fri Sep 27 19:16:38 CEST 2013


Hi Christoph-

We've added this feature to DiffBind. All you have to do is specify
"readFormat=DBA_READS_BED" in dba.count(). You can also use
readFormat=DBA_READS_BAM if you ever have BAM files, but for now all the
files need to be in the same format if you are not using the file
extension as the format indicator. We also added some format checks and
better error messages and warnings.
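
A minimal sketch of such a call, for illustration only. The sample sheet
name "samples.csv" is hypothetical; it is assumed to list Galaxy .dat
files (all BED-formatted) in its bamReads and bamControl columns. The
readFormat argument needs DiffBind 1.7.5 or later.

```r
library(DiffBind)

# Hypothetical sample sheet; bamReads/bamControl are assumed to point at
# Galaxy datasets with a .dat suffix that all contain BED-formatted reads.
samples <- dba(sampleSheet = "samples.csv")

# State the read format explicitly rather than relying on the file suffix.
# Use DBA_READS_BAM instead if every read file is a BAM file.
counts <- dba.count(samples, readFormat = DBA_READS_BED)
```

With the format given this way, the .dat files should not need to be
copied or renamed to .bed first.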

This is in DiffBind version 1.7.5 onwards, so you can get it now if you
are using the Development build (2.13). It will be in the official release
version in a few weeks.


Cheers-
Rory

On 20/09/2013 16:05, "Christoph Grunau" <christoph.grunau at univ-perp.fr>
wrote:

>Hi Rory,
>
>great! I really appreciate this and I'm looking forward to the new
>release.
>
>many many thanks - Christoph
>
>On 17 Sep 2013, at 18:17, Rory Stark <Rory.Stark at cruk.cam.ac.uk> wrote:
>
>> Hi Christoph-
>> 
>> Are the aligned read files you are using actually in BED format (chr,
>> start, end, name, score, strand)?
>> 
>> I'll look into adding an option to allow a .dat extension as an
>> alternative to .bed; if it is easy (and Gord has time) I'll try to have
>> it in the 2.13 release.
>> 
>> Cheers-
>> Rory
>> 
>> 
>> On 17/09/2013 16:23, "Christoph Grunau" <christoph.grunau at univ-perp.fr>
>> wrote:
>> 
>>> Dear Rory,
>>> 
>>> thank you very much for your DiffBind software. It's actually the only
>>> software that allows us to identify differentially modified regions in
>>> ChIPseq against histone isoforms. I use it a lot and I would like to
>>> share it with my colleagues via our local Galaxy instance.
>>> Unfortunately, there is one major obstacle. As you know, Galaxy files
>>> all have the uniform .dat suffix. However, DiffBind detects the file
>>> format of bamReads and bamControl from the suffix, which must be .bed,
>>> .bam or .bed.gz:
>>> 
>>> dba.count
>>> 
>>> Count reads in binding site intervals
>>> 
>>> Description
>>> Counts reads in binding site intervals. Files must be one of bam, bed
>>> and gzip-compressed bed. File suffixes must be ".bam", ".bed", or
>>> ".bed.gz" respectively.
>>> 
>>> I solved this issue by creating a copy of the .dat file with suffix
>>> .bed. However, this slows down the analysis and generates redundancy
>>> for files that are relatively large. Would it be possible to modify
>>> your code so that the file format of bamReads and bamControl can be
>>> passed as a parameter, similar to what is used for loading the peakset?
>>> I know I ask a lot but this modification would be tremendously helpful.
>>> 
>>> Many thanks and very best regards
>>> 
>>> Christoph Grunau
>>> 
>>> 
>>> --
>>> Christoph Grunau
>>> Prof. des Universités/Professor (HDR)
>>> Université de Perpignan Via Domitia
>>> UMR 5244 CNRS Ecologie et Evolution des Interactions (2EI)
>>> 52, avenue Paul Alduy
>>> 66860 PERPIGNAN Cedex
>>> France
>>> Tel 33 (0)4.68.66.21.80
>>> Fax 33 (0)4.68.66.22.81
>>> http://2ei.univ-perp.fr/
>>> http://methdb.univ-perp.fr/epievo/
>>> 
>>> 
>>> 
>> 
>


