[BioC] error in locateVariants for a GRanges object

Valerie Obenchain vobencha at fhcrc.org
Sun Mar 11 21:02:05 CET 2012


Hi Francesco,

Looks like you've hit a bug. Is your GRanges too large to attach or make 
available for testing?

Valerie

On 03/11/12 12:29, Lescai, Francesco wrote:
> But I now get a different error with a different dataset.
>
> k1.ranges = GRanges(
>    seqnames=paste("chr",CEUstats.variants$chromosome,sep=""),
>    IRanges(start=CEUstats.variants$position,
>            width=1)
>    )
>
>> head(k1.ranges)
> GRanges with 6 ranges and 0 elementMetadata cols:
>        seqnames             ranges strand
>           <Rle>           <IRanges>   <Rle>
>    [1]     chr1 [1177919, 1177919]      *
>    [2]     chr1 [1234763, 1234763]      *
>    [3]     chr1 [1246257, 1246257]      *
>    [4]     chr1 [1564953, 1564953]      *
>    [5]     chr1 [1887112, 1887112]      *
>    [6]     chr1 [1900107, 1900107]      *
>    ---
>    seqlengths:
>      chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 ... chr22  chr3  chr4  chr5  chr6  chr7  chr8  chr9
>        NA    NA    NA    NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA    NA    NA
>
>> k1.locations = locateVariants(k1.ranges, txdb19)
> Error in DataFrame(queryID = which(intergenic), location = location, txID = NA_integer_,  :
>    different row counts implied by arguments
>
> sessionInfo is the same as below.
> thanks very much,
>
> Francesco
>
>
> On 11 Mar 2012, at 18:53, Lescai, Francesco wrote:
>
> I adopted Steve's suggestion and went the "hard" way of re-compiling the packages in my session from source.
> Now it seems to work :-))
> Therefore, no idea where the problem was, but at least it is solved!
> VariantAnnotation is now .63 instead of .61; that might have changed a few things, together with the other packages.
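Which packages actually moved between the two quoted sessionInfo() listings can be checked mechanically. The sketch below is only an illustration: the version strings are copied by hand from the listings in this thread (the 10 Mar session that still failed and the 11 Mar session that works), and the variable names are made up for the example.

```r
# Versions copied from the sessionInfo() output quoted in this thread.
# 'before' = 10 Mar session (still failing), 'after' = 11 Mar session (working).
before <- c(TxDb.Hsapiens.UCSC.hg19.knownGene = "2.6.2",
            GenomicFeatures   = "1.7.30",
            VariantAnnotation = "1.1.61",
            Rsamtools         = "1.7.38",
            Biostrings        = "2.23.6",
            AnnotationDbi     = "1.17.27",
            Biobase           = "2.15.4",
            GenomicRanges     = "1.7.33",
            IRanges           = "1.13.28",
            BiocGenerics      = "0.1.12")
after  <- c(TxDb.Hsapiens.UCSC.hg19.knownGene = "2.6.4",
            GenomicFeatures   = "1.7.30",
            VariantAnnotation = "1.1.63",
            Rsamtools         = "1.7.40",
            Biostrings        = "2.23.6",
            AnnotationDbi     = "1.17.27",
            Biobase           = "2.15.4",
            GenomicRanges     = "1.7.34",
            IRanges           = "1.13.30",
            BiocGenerics      = "0.1.14")

# Compare as versions, not as plain strings.
common   <- intersect(names(before), names(after))
upgraded <- common[package_version(before[common]) < package_version(after[common])]
print(upgraded)
```

In these two listings GenomicFeatures, AnnotationDbi, Biobase and Biostrings carry the same version, so the comparison leaves them out.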
>
> this is my session in case it might be useful for the developers.
>
> thanks very much for your help!!
> Francesco
>
>
> sessionInfo()
> R Under development (unstable) (2012-01-20 r58146)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.4 GenomicFeatures_1.7.30
> [3] AnnotationDbi_1.17.27                   Biobase_2.15.4
> [5] VariantAnnotation_1.1.63                Rsamtools_1.7.40
> [7] Biostrings_2.23.6                       GenomicRanges_1.7.34
> [9] IRanges_1.13.30                         BiocGenerics_0.1.14
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.23.4    DBI_0.2-5          Matrix_1.0-4       RCurl_1.91-1       RSQLite_0.11.1
> [6] XML_3.9-4          biomaRt_2.11.1     bitops_1.0-4.1     ggplot2_0.8.9      grid_2.15.0
> [11] lattice_0.20-0     plyr_1.7.1         rtracklayer_1.15.7 snpStats_1.5.5     splines_2.15.0
> [16] stats4_2.15.0      survival_2.36-12   tools_2.15.0       zlibbioc_1.1.1
>
>
> On 11 Mar 2012, at 18:34, Martin Morgan wrote:
>
> On 03/11/2012 10:39 AM, Lescai, Francesco wrote:
> Hi, this is the traceback output.
>
> traceback()
> 12: stop(gettextf("invalid names for slots of class %s: %s", dQuote(Class),
>         paste(snames[is.na(which)], collapse = ", ")), domain = NA)
> 11: initialize(value, ...)
> 10: initialize(value, ...)
> 9: new("RangesMatching", matchMatrix = matchMatrix, DIM = DIM)
> 8: .local(query, subject, maxgap, minoverlap, type, select, ...)
> 7: findOverlaps(query, unlistSubject, maxgap = maxgap, type = type,
>        select = "all", ignore.strand = ignore.strand)
> 6: findOverlaps(query, unlistSubject, maxgap = maxgap, type = type,
>        select = "all", ignore.strand = ignore.strand)
> 5: .local(query, subject, maxgap, minoverlap, type, select, ...)
> 4: findOverlaps(queryAdj, cdsByTx, type = "within")
> 3: findOverlaps(queryAdj, cdsByTx, type = "within")
>
> 'cdsByTx' isn't used in this context in VariantAnnotation 1.1.61, which has two lines like
>
>   cdsCO<- countOverlaps(query, cache[["cdsByTx"]], type="within")
>   txFO<- findOverlaps(query, cache[["tx"]], type="within")
>
> that might be the current implementation. This line
>
> cdsFO<- findOverlaps(queryAdj, cdsByTx, type="within")
>
> _is_ in VariantAnnotation 1.0.5; I think you are getting the wrong version of VariantAnnotation, but this is not consistent with your sessionInfo().
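One way to test the "wrong version" hypothesis is to ask the running session which copy of a package it has actually loaded, rather than relying on what was last installed. A minimal sketch using only base R follows; `describe_loaded_package` is a hypothetical helper name, and "stats" is used so the example runs anywhere. In the failing session one would pass "VariantAnnotation" instead.

```r
# Hypothetical helper: report the version and on-disk location of a package
# as the *current session* sees it.
describe_loaded_package <- function(pkg) {
  loadNamespace(pkg)  # does nothing extra if the namespace is already loaded
  list(
    package = pkg,
    version = unname(getNamespaceVersion(pkg)),  # version of the loaded namespace
    path    = getNamespaceInfo(pkg, "path")      # directory it was loaded from
  )
}

info <- describe_loaded_package("stats")
str(info)
```

If the reported path is not the library the new version was installed into, the session is picking up an older copy.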
>
> Martin
>
>
> 2: locateVariants(my.ranges, txdb19)
> 1: locateVariants(my.ranges, txdb19)
>
> I tried to install the package, but it seems it still picks up the old version.
>
> biocLite("TxDb.Hsapiens.UCSC.hg19.knownGene")
> BioC_mirror: http://bioconductor.org
> Using R version 2.15, BiocInstaller version 1.3.7.
> Installing package(s) 'TxDb.Hsapiens.UCSC.hg19.knownGene'
> Installing package(s) into ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
> (as ‘lib’ is unspecified)
> Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/leopard/contrib/2.15
> trying URL 'http://bioconductor.org/packages/2.10/data/annotation/bin/macosx/leopard/contrib/2.15/TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.2.tgz'
> Content type 'application/x-gzip' length 16722765 bytes (15.9 Mb)
> opened URL
> ==================================================
> downloaded 15.9 Mb
>
> I also checked the website for the 2.10 release, and the Mac version of the package still seems to be 2.6.2.
>
> Is there any other package I can try to install manually?
> It seems I cannot access the BioC developer wiki at the moment.
>
> thanks
> Francesco
>
>
>
>
>
> On 10 Mar 2012, at 19:01, Martin Morgan wrote:
>
> On 03/10/2012 10:14 AM, Lescai, Francesco wrote:
> Thanks Martin,
> done, but I still get the same error.
>
> I can't spot the problem; maybe someone else will chime in.
>
> (a) TxDb.Hsapiens... is still out-of-date; maybe it isn't checked by biocLite()
>
> (b) the error
>
> my.locations = locateVariants(my.ranges, txdb19)
> Error in initialize(value, ...) :
>    invalid names for slots of class “RangesMatching”: matchMatrix, DIM
>
> definitely looks like an 'old package' issue -- the RangesMatching class was replaced by the 'Hits' class during this release cycle. It might help to call
>
> traceback()
>
> after the error, and to confirm that you are accessing only functions defined in the loaded packages by starting your R session with
>
> R --vanilla
>
> Obviously, the sessionInfo() needs to reflect the session the command fails in, and not, e.g., the R GUI in one instance and the terminal in the other.
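An "old package" problem of this kind often comes from a stale copy sitting in another library tree. A sketch for listing every installed copy of a package with base R is below; `installed_copies` is a hypothetical helper name, and "stats" is only a stand-in so the example runs without Bioconductor installed.

```r
# Hypothetical helper: list every installed copy of `pkg` across all
# library trees in .libPaths(), with its version and the R version it was built under.
installed_copies <- function(pkg, lib.loc = .libPaths()) {
  ip   <- installed.packages(lib.loc = lib.loc)
  hits <- ip[ip[, "Package"] == pkg,
             c("Package", "LibPath", "Version", "Built"),
             drop = FALSE]
  rownames(hits) <- NULL  # row names would be duplicated if several copies exist
  as.data.frame(hits, stringsAsFactors = FALSE)
}

copies <- installed_copies("stats")
print(copies)
if (nrow(copies) > 1) {
  message("More than one copy installed; library() uses the first match in .libPaths().")
}
```

Running the same call in both the R GUI and the terminal session would show whether the two see the same libraries.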
>
> Martin
>
>
> My new sessionInfo is
>
> sessionInfo()
> R Under development (unstable) (2012-01-20 r58146)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] BiocInstaller_1.3.7                     TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.2
> [3] GenomicFeatures_1.7.30                  VariantAnnotation_1.1.61
> [5] Rsamtools_1.7.38                        Biostrings_2.23.6
> [7] AnnotationDbi_1.17.27                   Biobase_2.15.4
> [9] GenomicRanges_1.7.33                    IRanges_1.13.28
> [11] BiocGenerics_0.1.12                     biomaRt_2.11.1
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.23.4    DBI_0.2-5          Matrix_1.0-4       RCurl_1.91-1       RSQLite_0.11.1
> [6] XML_3.9-4          bitops_1.0-4.1     ggplot2_0.8.9      grid_2.15.0        lattice_0.20-0
> [11] plyr_1.7.1         rtracklayer_1.15.7 snpStats_1.5.5     splines_2.15.0     survival_2.36-12
> [16] tools_2.15.0       zlibbioc_1.1.1
>
>
> On 10 Mar 2012, at 17:50, Martin Morgan wrote:
>
> On 03/10/2012 09:39 AM, Lescai, Francesco wrote:
> Hi there,
> maybe I'm just making a silly error somewhere, but I get an error when trying to locate the variants from a GRanges object.
> I have a file with SNP positions, therefore I build up the GRanges this way
>
> my.ranges = GRanges(
>   seqnames=paste("chr", my.snp.unique$chromosome, sep=""),
>   IRanges(start= my.snp.unique$position,
>           width=1))
>
> head(my.ranges)
> GRanges with 6 ranges and 0 elementMetadata values:
>       seqnames               ranges strand
>          <Rle>                <IRanges>      <Rle>
>   [1]     chr1 [ 1323144,  1323144]      *
>   [2]     chr1 [ 3544236,  3544236]      *
>   [3]     chr1 [ 6252966,  6252966]      *
>   [4]     chr1 [ 7861154,  7861154]      *
>   [5]     chr1 [10425118, 10425118]      *
>   [6]     chr1 [10502308, 10502308]      *
>   ---
>   seqlengths:
>     chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 ...  chr4  chr5  chr6  chr7  chr8  chr9  chrX  chrY
>       NA    NA    NA    NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA    NA    NA
>
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> txdb19<- TxDb.Hsapiens.UCSC.hg19.knownGene
> #
> my.locations = locateVariants(my.ranges, txdb19)
> Error in initialize(value, ...) :
>   invalid names for slots of class “RangesMatching”: matchMatrix, DIM
>
> What am I doing wrong?
>
> Your devel packages are out of date, so I'd start with
>
> source("http://bioconductor.org/biocLite.R")
> biocLite(character())
>
> Martin
>
>
> thanks,
> Francesco
>
>
> sessionInfo()
> R Under development (unstable) (2012-01-20 r58146)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.2 GenomicFeatures_1.6.7
> [3] VariantAnnotation_1.1.33                Rsamtools_1.7.27
> [5] Biostrings_2.23.6                       AnnotationDbi_1.17.15
> [7] Biobase_2.15.3                          GenomicRanges_1.7.16
> [9] IRanges_1.13.22                         BiocGenerics_0.1.4
> [11] biomaRt_2.11.1
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.23.2    DBI_0.2-5          Matrix_1.0-3       RCurl_1.9-5        RSQLite_0.11.1
> [6] XML_3.8-0          bitops_1.0-4.1     ggplot2_0.8.9      grid_2.15.0        lattice_0.20-0
> [11] plyr_1.7.1         rtracklayer_1.15.7 snpStats_1.5.3     splines_2.15.0     survival_2.36-10
> [16] tools_2.15.0       zlibbioc_1.1.1
>
> ---------------------------------------------------------------------------------
> Francesco Lescai, PhD, EDBT
> Senior Research Associate in Genome Analysis
> University College London
> Faculty of Population Health Sciences
> Dept. Genes, Development & Disease
> ICH - Molecular Medicine Unit, GOSgene team
> 30 Guilford Street
> WC1N 1EH London UK
>
> email: f.lescai at ucl.ac.uk
> phone: +44.(0)207.905.2274
> [ext: 2274]
> --------------------------------------------------------------------------------
>
>
> [[alternative HTML version deleted]]
>
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
>
>


