[BioC] DEXSeq update
Simon Anders
anders at embl.de
Mon Oct 7 08:51:20 CEST 2013
Hi Margaret
On 06/10/13 22:51, Margaret Linan wrote:
> It appears that one of my regular unsorted SAM files is truncated.
If the SAM file is unsorted, this might be annoying but okay: the lost
reads should be spread more or less evenly over all genes. If it is
sorted by position, you have a problem: all genes on chromosomes with
high numbers (those that come late in the sort order) will be missing
and hence wrongly appear downregulated in this sample.
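
To see whether the late chromosomes are affected, you could count the
records per chromosome. The following is only a rough sketch, not part
of DEXSeq or HTSeq; the function name reads_per_reference is made up for
illustration, and it assumes plain SAM text as produced by "samtools
view -h".

```python
from collections import Counter


def reads_per_reference(sam_lines):
    """Count alignment records per reference name (SAM column 3, RNAME)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            # Header line, not an alignment record.
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # Incomplete record, e.g. the last line of a truncated file.
            continue
        counts[fields[2]] += 1
    return counts
```

Passing sys.stdin to this function while piping in the output of
samtools view gives one count per chromosome. A chromosome that has
plenty of reads in your other samples but few or none here would point
to the truncation problem described above.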
> Though the sort works fine and samtools view does not detect truncation
> in my file.sam.bam, I still get error messages after attempting to generate
> counts files.
>
> samtools view -h file.sam.bam | python dexseq_count.py file.gff -
> file.counts
>
> Do you think that the truncation may be behind the problems?
Sure, if the truncation causes some mates to be missing, htseq-count
will complain about it. Whether you need to worry about it depends on
whether it affects only a small fraction of the reads or more.
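
To get a feeling for the size of that fraction, one could tally how many
read pairs have lost a mate. This is a hedged sketch under a few
assumptions: the input is SAM text, paired reads carry the standard FLAG
bits (0x1 paired, 0x40 first mate, 0x80 second mate), and the whole set
of read names fits into memory. The name missing_mate_fraction is
hypothetical and not an HTSeq function.

```python
def missing_mate_fraction(sam_lines):
    """Return the fraction of paired read names for which only one mate is present."""
    seen = {}  # read name -> bitmask: 1 = first mate seen, 2 = second mate seen
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # skip incomplete records
        flag = int(fields[1])
        # Ignore unpaired reads and secondary (0x100) or supplementary (0x800) records.
        if not flag & 0x1 or flag & 0x900:
            continue
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        seen[fields[0]] = seen.get(fields[0], 0) | mate
    if not seen:
        return 0.0
    orphans = sum(1 for mask in seen.values() if mask != 3)
    return orphans / len(seen)
```

A value close to zero would suggest that the complaints concern only a
few pairs; a large value would mean that a sizeable part of the data is
affected.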
> I can't check the structure of my SAM.BAM file because it's all symbols,
> though when I use samtools view, the contents run across the screen, and I
> am unable to pinpoint any irregularity.
You can write the BAM file back into a SAM file ("samtools view
abc_sorted.bam > abc_sorted.sam"; add -h if you also want the header)
to inspect it.
Simon