[BioC] Possible problem with featureCounts() and scipen
Wei Shi
shi at wehi.EDU.AU
Wed Jun 18 02:34:35 CEST 2014
Dear Oscar,
I have fixed the bug and committed to bioc (1.14.2).
Also, you'd better refine your commands like below:
regions <- data.frame(GeneID="G1", Chr="chr2", Start=202000000, End=202002100, Strand="+")
featureCounts("MyReadsfile.bam", annot.ext=regions, ignoreDup=T)
Best wishes,
Wei
On Jun 17, 2014, at 9:00 PM, Oscar Rueda wrote:
> Dear list,
> I've noticed a problem counting reads using the Rsubread package. I can not include a reproducible example for lack of a bam file, but I hope the following code explains the problem:
>
> Suppose I have this genomic region:
>
>> regions <- data.frame(GeneID="G1", Chr="chr2", Start=202000000, End=202002100, Strand=1)
>
> Then I want to count the reads in this region:
>
>> featureCounts("MyReadsfile.bam", annot.ext=regions, annot.inbuilt="hg19", ignoreDup=T)
>
> I get a huge number of reads, but if I look at the output, I see that the annotation is wrong:
>
> $counts
> X.MyReadsfile.bam
> G1 715291
>
> $annotation
> GeneID Chr Start End Strand Length
> 1 Error chr2 2 202002100 + 202002099
>
> That is, the start is 2 instead of 202000000. If I print my regions I see
>
>> regions
> GeneID Chr Start End Strand
> 1 Error chr2 2.02e+08 202002100 1
>
> That is, the scientific notation is not taken properly.
> I can fix this doing
>> par(scipen=10)
>
> But I wanted to ask if anyone has noticed this behaviour and if this is expected by the function.
>
> Thanks a lot for your comments,
> Oscar
>
> Oscar M. Rueda, PhD.
> Postdoctoral Research Fellow, Caldas Lab, Breast Cancer Functional
> Genomics.
> University of Cambridge. Cancer Research UK Cambridge Institute.
> Li Ka Shing Centre, Robinson Way.
> Cambridge CB2 0RE
> England
>
>
>> sessionInfo()
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] Rsubread_1.14.1 GenomicAlignments_1.0.1 BSgenome_1.32.0
> [4] Rsamtools_1.16.1 Biostrings_2.32.0 XVector_0.4.0
> [7] GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.9
> [10] BiocGenerics_0.10.0
>
> loaded via a namespace (and not attached):
> [1] BatchJobs_1.2 BBmisc_1.6 BiocParallel_0.6.1 bitops_1.0-6
> [5] brew_1.0-6 codetools_0.2-8 DBI_0.2-7 digest_0.6.4
> [9] fail_1.2 foreach_1.4.2 iterators_1.0.7 plyr_1.8.1
> [13] Rcpp_0.11.2 RSQLite_0.11.4 sendmailR_1.1-2 stats4_3.1.0
> [17] stringr_0.6.2 tools_3.1.0 zlibbioc_1.10.0
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
______________________________________________________________________
The information in this email is confidential and intend...{{dropped:6}}
More information about the Bioconductor
mailing list