[BioC] DEXSeq update

Simon Anders anders at embl.de
Mon Oct 7 08:51:20 CEST 2013


Hi Margaret

On 06/10/13 22:51, Margaret Linan wrote:
> It appears that one of my regular unsorted SAM files is truncated.

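If the SAM file is unsorted, this might be annoying but okay, because the 
lost reads should be spread more or less evenly over the genome. If it is 
sorted by position, you have a problem: all genes on the chromosomes that 
come last in the sort order (typically those with high numbers) will be 
missing and hence wrongly appear downregulated in this sample.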
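
To see whether the later chromosomes are actually missing, one could count 
the records per reference sequence. The following is only a minimal sketch 
in Python, not part of DEXSeq or HTSeq: the function name 
reads_per_reference and the example records are made up for illustration, 
and it assumes plain SAM text lines with the reference name in column 3.

```python
from collections import Counter


def reads_per_reference(sam_lines):
    """Count alignment records per reference sequence (SAM column 3, RNAME).

    Hypothetical helper for illustration; expects an iterable of SAM text lines.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # too short to carry a reference name
        rname = fields[2]
        if rname != "*":  # "*" marks a record without a reference
            counts[rname] += 1
    return counts


# Made-up example records, just to show the call.
example_lines = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t0\tchr2\t300\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
counts = reads_per_reference(example_lines)
for name in sorted(counts):
    print(name, counts[name])
```

On a real file one would pass an open file handle instead of the list and 
compare the counts with those from an intact sample.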

> Though the sort works fine and samtools view does not detect truncation
> in my file.sam.bam, I still get error messages after attempting to generate
> counts files.
>
> samtools view -h file.sam.bam | python dexseq_count.py file.gff - file.counts
>
> Do you think that the truncation may be behind the problems?

Sure, if the truncation causes some mates to be missing, htseq-count 
will complain about it. Whether you need to worry about it depends on 
whether it affects only a small fraction of the reads or a large part 
of them.
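
To get a rough idea of how large that fraction is, one could count the 
paired reads for which only one mate is present. Again this is only a 
sketch under assumptions: the function name missing_mate_fraction and the 
example records are hypothetical, the FLAG bits are those defined in the 
SAM format (0x1 paired, 0x40 first mate, 0x100 secondary, 0x800 
supplementary), and all read names are kept in memory, which may be too 
much for a very large file.

```python
from collections import defaultdict


def missing_mate_fraction(sam_lines):
    """Fraction of paired read names for which only one mate was seen.

    Hypothetical helper for illustration; expects an iterable of SAM text lines.
    """
    mates_seen = defaultdict(set)  # read name -> {"first", "second"}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # incomplete record
        flag = int(fields[1])
        if not flag & 0x1:
            continue  # read is not part of a pair
        if flag & 0x900:
            continue  # ignore secondary (0x100) and supplementary (0x800) records
        mates_seen[fields[0]].add("first" if flag & 0x40 else "second")
    if not mates_seen:
        return 0.0
    orphans = sum(1 for mates in mates_seen.values() if len(mates) < 2)
    return orphans / len(mates_seen)


# Made-up example: pair_a is complete, pair_b has lost its second mate,
# single_c is an unpaired read and is not counted.
example_lines = [
    "pair_a\t99\tchr1\t100\t60\t4M\t=\t150\t54\tACGT\tIIII\n",
    "pair_a\t147\tchr1\t150\t60\t4M\t=\t100\t-54\tACGT\tIIII\n",
    "pair_b\t99\tchr1\t300\t60\t4M\t=\t350\t54\tACGT\tIIII\n",
    "single_c\t0\tchr1\t500\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
fraction = missing_mate_fraction(example_lines)
print(fraction)
```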

> I can't check the structure of my SAM.BAM file because it's all symbols,
> though when I use samtools view, the contents run across the screen, and I
> am unable to pinpoint any irregularity.

You can write the BAM file back into a SAM file ("samtools view 
abc_sorted.bam > abc_sorted.sam") to inspect it. Add the -h option if you 
also want the header lines in the output.
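
Rather than reading the SAM file by eye, one could scan it for records 
that look cut off. This is a minimal sketch, not a full validator: the 
function name find_malformed_records and the example records are 
hypothetical, and it only checks two things, namely that each record has 
the 11 mandatory SAM fields and that SEQ and QUAL have the same length.

```python
def find_malformed_records(sam_lines):
    """Return (line_number, reason) for records that look truncated.

    Hypothetical helper for illustration; expects an iterable of SAM text lines.
    """
    problems = []
    for number, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            problems.append(
                (number, "only %d of 11 mandatory fields" % len(fields))
            )
            continue
        seq, qual = fields[9], fields[10]
        # "*" means the value was not stored, so lengths cannot be compared.
        if seq != "*" and qual != "*" and len(seq) != len(qual):
            problems.append((number, "SEQ and QUAL differ in length"))
    return problems


# Made-up example: the last record is cut off in the middle of SEQ.
example_lines = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr2\t300\t60\t4M\t*\t0\t0\tAC",
]
problems = find_malformed_records(example_lines)
for number, reason in problems:
    print(number, reason)

# On a real file (file name taken from the command above):
# with open("abc_sorted.sam") as handle:
#     print(find_malformed_records(handle))
```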

   Simon


