[BioC] Subsetting "sites only" VCF objects

Richard Pearson rpearson at well.ox.ac.uk
Fri Aug 24 13:10:12 CEST 2012


Hi

It is wonderful that we can create subsets of VariantAnnotation VCF 
objects using [, but I have found that this doesn't work for VCFs that 
are "sites only", i.e. that have no information in geno(vcf):
   > geno(vcf)
   SimpleList of length 0
   > passVcf <- vcf[values(fixed(vcf))[["FILTER"]] == "PASS", ]
   Error in colData(x)[j, , drop = FALSE] :
     selecting rows: subscript out of bounds

In these cases I can create subsets, e.g. using:
   isPass <- values(fixed(vcf))[["FILTER"]] == "PASS"
   passVcf <- VCF(
     rowData = rowData(vcf)[isPass],
     colData = colData(vcf),
     exptData = exptData(vcf),
     fixed = values(fixed(vcf))[isPass, -1],
     info = values(info(vcf))[isPass, -1]
   )

But it would be great if I could also do this using [. Any chance this 
functionality could be included in a future version of VariantAnnotation?

Thanks

Richard

 > sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] VEDA_0.0.1               ggplot2_0.9.1            VariantAnnotation_1.2.10
[4] Rsamtools_1.8.6          Biostrings_2.24.1        GenomicRanges_1.8.12
[7] IRanges_1.14.4           BiocGenerics_0.2.0       malariagen_0.0.1

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.18.1  Biobase_2.16.0        biomaRt_2.12.0
 [4] bitops_1.0-4.1        BSgenome_1.24.0       colorspace_1.1-1
 [7] DBI_0.2-5             dichromat_1.2-4       digest_0.5.2
[10] GenomicFeatures_1.8.2 grid_2.15.0           labeling_0.1
[13] lattice_0.20-6        MASS_7.3-20           Matrix_1.0-6
[16] memoise_0.1           munsell_0.3           plyr_1.7.1
[19] proto_0.3-9.2         RColorBrewer_1.0-5    RCurl_1.91-1
[22] reshape2_1.2.1        RSQLite_0.11.1        rtracklayer_1.16.3
[25] scales_0.2.1          snpStats_1.6.0        splines_2.15.0
[28] stats4_2.15.0         stringr_0.6.1         survival_2.36-14
[31] tools_2.15.0          XML_3.9-4             zlibbioc_1.2.0


