[BioC] need additional sanity checks in TabixFile or readVcf

Valerie Obenchain vobencha at fhcrc.org
Tue Oct 1 19:27:02 CEST 2013


Hi Jeremy,

The problem here is that the positions specified in 'rng' are not in the 
VCF file. In this case readVcf() should return an empty VCF. This was 
fixed in devel some time ago but not in release; it is now fixed in 
release as of v 1.6.8.

 > vcf <- readVcf(tab, "hg19",myparam)
 > vcf
class: CollapsedVCF
dim: 0 3
...
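
Once an empty VCF is returned, the caller can test for it before doing 
further work. A minimal sketch, assuming the fixed behaviour described 
above; 'is_empty_vcf' is a hypothetical helper name, not a 
VariantAnnotation function, and the ranges are the ones from your post:

```r
## Hypothetical helper (not part of VariantAnnotation): TRUE when an object
## has zero rows, e.g. a VCF whose 'which' ranges matched no records.
is_empty_vcf <- function(x) dim(x)[1L] == 0L

## The part below needs VariantAnnotation (which attaches Rsamtools);
## it is skipped when the package is not installed.
if (requireNamespace("VariantAnnotation", quietly = TRUE)) {
    suppressPackageStartupMessages(library(VariantAnnotation))
    fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
    compressVcf <- bgzip(fl, tempfile())
    idx <- indexTabix(compressVcf, "vcf")
    tab <- TabixFile(compressVcf, idx)
    rng <- GRanges(seqnames = "20",
                   ranges = IRanges(start = c(250000, 500000),
                                    end = c(300000, 600000)))
    vcf <- readVcf(tab, "hg19", ScanVcfParam(which = rng))
    if (is_empty_vcf(vcf))
        message("no variants in the requested ranges")
}
```

The helper only looks at the first element of dim(), so it works on 
anything that reports dimensions, not just VCF objects.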


In your previous post you asked about supplying a VCF with corresponding 
.tbi file as a TabixFile. This should be fine. If the .tbi file does not 
exist, an error will be thrown:

 > fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
 > tab<-TabixFile(fl, paste(fl,"bgz.tbi",sep="."))
Error: TabixFile: file(s) do not exist:
  '/home/vobencha/R/R-rel/R-3-0-branch/library/VariantAnnotation/extdata/ex2.vcf'
  '/home/vobencha/R/R-rel/R-3-0-branch/library/VariantAnnotation/extdata/ex2.vcf.bgz.tbi'
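
If you would rather check for the files yourself before constructing the 
TabixFile, something along these lines should work. This is only a 
sketch in base R; 'missing_tabix_files' is a hypothetical helper, and the 
default index name assumes the usual convention of appending ".tbi" to 
the data file name:

```r
## Hypothetical helper (not part of Rsamtools): return the paths among the
## data file and its index that do not exist on disk.
## The default index name assumes the "<file>.tbi" convention.
missing_tabix_files <- function(file, index = paste(file, "tbi", sep = ".")) {
    paths <- c(file, index)
    paths[!file.exists(paths)]
}

## Usage: a temporary path that has not been created yet, so both the
## data file and its index are reported as missing.
fl <- tempfile(fileext = ".vcf.bgz")
absent <- missing_tabix_files(fl)
if (length(absent) > 0L)
    message("missing: ", paste(absent, collapse = ", "))
```

A zero-length result means both files are present and it is safe to call 
TabixFile().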

Because the error was the same in both situations, my guess is that the 
ranges in your 'param' were again outside the positions present in the 
file. If that wasn't the case, please provide a reproducible example of 
the TabixFile case and I'll look into it.

Valerie

On 10/01/2013 09:03 AM, Jeremy Leipzig wrote:
> well that's odd. managed to get the same error using the most kosher
> workflow I could find:
>> vcfFile <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
>> from <- vcfFile
>> to <- tempfile()
>> compressVcf <- bgzip(from, to)
>> idx <- indexTabix(compressVcf, "vcf")
>> tab <- TabixFile(compressVcf, idx)
>> rng <- GRanges(seqnames="20", ranges=IRanges(start=c(250000,
> 500000),end=c(300000,600000)))
>> myparam<-ScanVcfParam(which=rng)
>> vcf <- readVcf(tab, "hg19",myparam)
> Error in lapply(names(vcf[[1]]), function(elt) { :
>    error in evaluating the argument 'X' in selecting a method for function
> 'lapply': Error in vcf[[1]] : subscript out of bounds
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  utils     datasets  grDevices methods
> [8] base
>
> other attached packages:
> [1] VariantAnnotation_1.6.7 Rsamtools_1.12.4        Biostrings_2.28.0
> [4] GenomicRanges_1.12.5    IRanges_1.18.4          BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
>   [1] AnnotationDbi_1.22.6   Biobase_2.20.1         biomaRt_2.16.0
>   [4] bitops_1.0-6           BSgenome_1.28.0        DBI_0.2-7
>   [7] GenomicFeatures_1.12.4 RCurl_1.95-4.1         RSQLite_0.11.4
> [10] rtracklayer_1.20.4     stats4_3.0.1           tools_3.0.1
> [13] XML_3.98-1.1           zlibbioc_1.6.0
