[BioC] RNASeq:- getting Zero Count

Valerie Obenchain vobencha at fhcrc.org
Mon Aug 5 17:51:52 CEST 2013


Hi Reema,

To perform overlap or matching operations the seqlevels (chromosome 
names) of the objects must match. The error message is telling you that 
some of these do not match. It's reasonable that a few names may not 
match (maybe a chromosome is present in one object and not the other) 
but the majority should.

Check the seqlevels:
seqlevels(aln)
seqlevels(gnModel)

Which names are common to both:
intersect(seqlevels(gnModel), seqlevels(aln))

  You can rename seqlevels in several different ways. See 
?renameSeqlevels or ?seqlevels for examples.

Valerie

On 08/04/2013 06:35 AM, Reema Singh wrote:
> Dear All,
>
> I am trying to extract the read count from three .bam files. But I am
> getting Zero count entries.
>
> I am using Mycobacterium Tuberculosis H37Rv gtf file (
> ftp://ftp.ensemblgenomes.org/pub/release-19/bacteria//gtf/bacteria_1_collection/mycobacterium_tuberculosis_h37rv/Mycobacterium_tuberculosis_h37rv.GCA_000277735.1.19.gtf.gz)
> and RNASeq data used here were downloaded from (
> http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE40846) and aligned
> with bowtie2.
>
>
> library(GenomicFeatures)
> txdb <-
> makeTranscriptDbFromGFF(file="Mycobacterium_tuberculosis_h37rv.GCA_000277735.1.19.gtf",format="gtf")
> saveDb(txdb,file="MycoTubeH37Rv.sqlite")
> load("MycoTubeH37Rv.sqlite")
> gnModel <- exonsBy(txdb,"gene") ### *also tried with "transcripts", "cds",
> but getting same *
>
> bamFiles <- list.files(".", "bam$", full=TRUE)
> names(bamFiles) <- sub("\\..*","",basename(bamFiles))
> counter <- function(fl, gnModel){
> aln <- GenomicRanges::readGappedAlignments(fl)
> strand(aln)
> hits <- countOverlaps(aln,gnModel)
> counts <- countOverlaps(gnModel,aln[hits==5])
> names(counts) <- names(gnModel)
> counts
> }
>
> counts <- sapply(bamFiles,counter,gnModel)
>
> Note: method with signature ‘Vector#GRangesList’ chosen for function
> ‘countOverlaps’,
>   target signature ‘GappedAlignments#GRangesList’.
>   "GappedAlignments#Vector" would also be valid
> Note: method with signature ‘GRangesList#Vector’ chosen for function
> ‘countOverlaps’,
>   target signature ‘GRangesList#GappedAlignments’.
>   "Vector#GappedAlignments" would also be valid
> Warning messages:
> 1: In .Seqinfo.mergexy(x, y) :
>    Each of the 2 combined objects has sequence levels not in the other:
>    - in 'x': gi|448814763|ref|NC_000962.3|
>    - in 'y': Chromosome
>    Make sure to always combine/compare objects based on the same reference
>    genome (use suppressWarnings() to suppress this warning).
> 2: In .Seqinfo.mergexy(x, y) :
>    Each of the 2 combined objects has sequence levels not in the other:
>    - in 'x': Chromosome
>    - in 'y': gi|448814763|ref|NC_000962.3|
>    Make sure to always combine/compare objects based on the same reference
>    genome (use suppressWarnings() to suppress this warning).
> 3: In .Seqinfo.mergexy(x, y) :
>    Each of the 2 combined objects has sequence levels not in the other:
>    - in 'x': gi|448814763|ref|NC_000962.3|
>    - in 'y': Chromosome
>    Make sure to always combine/compare objects based on the same reference
>    genome (use suppressWarnings() to suppress this warning).
> 4: In .Seqinfo.mergexy(x, y) :
>    Each of the 2 combined objects has sequence levels not in the other:
>    - in 'x': Chromosome
>    - in 'y': gi|448814763|ref|NC_000962.3|
>    Make sure to always combine/compare objects based on the same reference
>    genome (use suppressWarnings() to suppress this warning).
> 5: In .Seqinfo.mergexy(x, y) :
>    Each of the 2 combined objects has sequence levels not in the other:
>    - in 'x': gi|448814763|ref|NC_000962.3|
>    - in 'y': Chromosome
>    Make sure to always combine/compare objects based on the same reference
>    genome (use suppressWarnings() to suppress this warning).
> 6: In .Seqinfo.mergexy(x, y) :
>    Each of the 2 combined objects has sequence levels not in the other:
>    - in 'x': Chromosome
>    - in 'y': gi|448814763|ref|NC_000962.3|
>    Make sure to always combine/compare objects based on the same reference
>    genome (use suppressWarnings() to suppress this warning).
>
> head(counts)
>
>            SRR568038 SRR568039 SRR568040
> RVBD_0001         0         0         0
> RVBD_0002         0         0         0
> RVBD_0003         0         0         0
> RVBD_0004         0         0         0
> RVBD_0005         0         0         0
> RVBD_0006         0         0         0
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>   [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
>   [7] LC_PAPER=C                LC_NAME=C
>   [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] Rsamtools_1.12.3       Biostrings_2.28.0      GenomicFeatures_1.12.3
> [4] AnnotationDbi_1.22.6   Biobase_2.20.1         GenomicRanges_1.12.4
> [7] IRanges_1.18.2         BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
>   [1] biomaRt_2.16.0     bitops_1.0-5       BSgenome_1.28.0    DBI_0.2-6
>
>   [5] RCurl_1.95-4.1     RSQLite_0.11.3     rtracklayer_1.20.4 stats4_3.0.1
>
>   [9] tools_3.0.1        XML_3.96-1.1       zlibbioc_1.6.0
>
>
> Kind regards
>
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list