[BioC] DEXSeq: problem with dexseq_prepare_annotation.py

Simon Anders anders at embl.de
Thu Jan 17 21:45:59 CET 2013


Hi

On 17/01/13 11:15, Geetha Venkatesh wrote:
> I too have the same problem while using dexseq_prepare_annotation.py
> with UCSC annotation file.  I am getting the following error:
>> Traceback (most recent call last):
>>   File
>> "/work_space/software/external_downloaded_files/R-2.14.2/library/DEXSeq/python_scripts/dexseq_prepare_annotation.py",
>> line 91, in <module>
>>     assert l[i].iv.end <= l[i+1].iv.start, str(l[i+1]) + " starts too
>> early"
>> AssertionError: <GenomicFeature: exonic_part 'CFB' at chr6_dbb_hap3:
>> 3199308 -> 3199650 (strand '+')> starts too early
>
> Has anyone fixed this issue?  Any help would be highly appreciated.

These UCSC annotation files seem to create a lot of trouble. 
(Admittedly, this might also be because we -- as loyal EMBL employees ;-) 
-- use Ensembl rather than UCSC for most things and hence test our 
software mainly with Ensembl data.)

Could you grep the lines concerning gene 'CFB' and post them? Maybe we 
can see something.
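
If grep is not at hand, the same lookup can be sketched in Python. This 
is only an illustration: the function name lines_for_gene is made up 
here, and it assumes the gene is labelled gene_id "CFB" in the ninth 
(attribute) column, as the error message above suggests.

```python
def lines_for_gene(gtf_lines, gene_id):
    """Return the GTF lines whose attribute column names the given gene_id.

    Assumption: the attribute column contains the text gene_id "<name>";
    the closing quote keeps 'CFB' from also matching e.g. 'CFBX'.
    """
    needle = 'gene_id "%s"' % gene_id
    hits = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        # A GTF data line has nine tab-separated columns; attributes are last.
        if len(fields) >= 9 and needle in fields[8]:
            hits.append(line)
    return hits
```

To use it, pass an open file, for example 
lines_for_gene(open("annotation.gtf"), "CFB"), where annotation.gtf 
stands in for your own UCSC file, and print the returned lines.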

Maybe you could also remove all the lines regarding chromosome variants. 
If you leave only lines for "chr6" and remove everything with 
"chr6_...", this might solve the issue. (In fact, maybe try removing all 
lines from the GTF file that contain an underscore anywhere in the 
first field.) Unless you are doing something special, I guess you do not 
need them anyway.
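
A minimal sketch of that filtering step in Python, in case it helps. 
The function name drop_variant_contigs is hypothetical, and the sketch 
assumes a tab-separated GTF file in which comment lines start with "#".

```python
def drop_variant_contigs(gtf_lines):
    """Keep only GTF lines whose first field (the sequence name) has no underscore.

    Comment lines starting with '#' are passed through unchanged.
    """
    kept = []
    for line in gtf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        seqname = line.split("\t", 1)[0]  # first tab-separated field
        if "_" not in seqname:
            kept.append(line)  # e.g. "chr6" stays, "chr6_dbb_hap3" is dropped
    return kept
```

The result can be written to a new file with writelines(), and that 
file is then given to dexseq_prepare_annotation.py in place of the 
original GTF file.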

   Simon


