[BioC] [VariantAnnotation] subsetting VCF objects

Paul Theodor Pyl paul.theodor.pyl at embl.de
Wed Nov 14 14:41:24 CET 2012


Hi all,

I am reading some .vcf files with the readVcf function and found that I
cannot subset the resulting VCF object when its info field is empty (see
the example below).

Is there a workaround other than loading at least part of the info?

Thanks,
Paul

The Example:
 > vcf_full = readVcf("test.vcf.gz", "hg19")
 > vcf_no_info = readVcf("test.vcf.gz", "hg19", param = ScanVcfParam(geno=c("GT","GQ"), fixed="ALT", info=NA))
 > vcf_full
class: VCF
dim: 71128 2
genome: hg19
exptData(1): header
fixed(4): REF ALT QUAL FILTER
info(22): AC AF ... SB STR
geno(5): AD DP GQ GT PL
rownames(71128): rs62224610 rs141578542 ... 22:51243743 22:51244332
rowData values names(1): paramRangeID
colnames(2): sample_one sample_two
colData names(1): Samples
 > vcf_no_info
class: VCF
dim: 71128 2
genome: hg19
exptData(1): header
fixed(2): REF ALT
info(0):
geno(2): GQ GT
rownames(71128): rs62224610 rs141578542 ... 22:51243743 22:51244332
rowData values names(1): paramRangeID
colnames(2): sample_one sample_two
colData names(1): Samples
 > vcf_full[1:10]
class: VCF
dim: 10 2
genome: hg19
exptData(1): header
fixed(4): REF ALT QUAL FILTER
info(22): AC AF ... SB STR
geno(5): AD DP GQ GT PL
rownames(10): rs62224610 rs141578542 ... 22:16058463 rs149413786
rowData values names(1): paramRangeID
colnames(2): sample_one sample_two
colData names(1): Samples
 > vcf_no_info[1:10]
Error in slot(x, "info")[i, , drop = FALSE] :
   selecting rows: subscript contains NAs or out of bounds indices

 > sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] VariantAnnotation_1.4.3 Rsamtools_1.10.2        Biostrings_2.26.2
[4] GenomicRanges_1.10.5    IRanges_1.16.4          BiocGenerics_0.4.0

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.20.2   Biobase_2.18.0         biomaRt_2.14.0
 [4] bitops_1.0-5           BSgenome_1.26.1        compiler_2.15.2
 [7] DBI_0.2-5              GenomicFeatures_1.10.0 parallel_2.15.2
[10] RCurl_1.95-3           RSQLite_0.11.2         rtracklayer_1.18.0
[13] stats4_2.15.2          tools_2.15.2           XML_3.95-0.1
[16] zlibbioc_1.4.0


