[BioC] VariantAnnotation - MatrixToSnpMatrix - only returns NAs

Lavinia Gordon lavinia.gordon at mcri.edu.au
Wed Jan 23 02:35:46 CET 2013

Hi, I have just started working with VCF files and have discovered the VariantAnnotation package. Many thanks for making these functions available.
I followed the code outlined in the reference manual for MatrixToSnpMatrix, but with my VCF the resulting genotype matrix contains only NA values:
> head(geno(vcf)$GT)
           GHS008 GHS015 GHS025 GHS026 GHS027 GHS031 GHS033 GHS034 GHS036
chrM:73    "1/1"  "0/0"  "1/1"  "0/0"  "0/0"  "1/1"  "0/0"  "0/0"  "0/0" 
chrM:119   "0/0"  "0/0"  "0/0"  "1/1"  "1/1"  "0/0"  "0/0"  "0/0"  "0/0" 
rs72619361 "0/0"  "1/1"  "0/0"  "0/0"  "0/0"  "0/0"  "1/1"  "1/1"  "1/1" 
chrM:150   "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1"  "1/1" 
chrM:189   "0/0"  "0/0"  "0/0"  "1/1"  "1/1"  "0/0"  "0/0"  "0/0"  "0/0" 
chrM:195   "1/1"  "1/1"  "1/1"  "0/0"  "0/0"  "1/1"  "1/1"  "1/1"  "1/1" 
> head(t(as(mat$genotype, "character")))
           GHS008 GHS015 GHS025 GHS026 GHS027 GHS031 GHS033 GHS034 GHS036
chrM:73    "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"  
chrM:119   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"  
rs72619361 "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"  
chrM:150   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"  
chrM:189   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"  
chrM:195   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"   "NA"  

I have run the reference manual code with the supplied example VCF and the output looks correct.
I have no reason to suspect that there is anything wrong with my VCF.
Could anyone suggest how I can troubleshoot this and work out why every genotype is being converted to NA?

Many thanks,

Lavinia Gordon
Senior Research Officer
Quantitative Sciences Core, Bioinformatics

Murdoch Childrens Research Institute
The Royal Children's Hospital
Flemington Road Parkville Victoria 3052 Australia 
T 03 8341 6221

> vcf
class: VCF 
dim: 4665545 9 
genome: hg19 
exptData(1): header
info(19): AC AF ... SB EFF
geno(5): AD DP GQ GT PL
rownames(4665545): chrM:73 chrM:119 ... chrUn_gl000249:14244
rowData values names(1): paramRangeID
colnames(9): GHS008 GHS015 ... GHS034 GHS036
colData names(1): Samples

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] snpStats_1.8.1          Matrix_1.0-10           lattice_0.20-13        
 [4] survival_2.37-2         VariantAnnotation_1.4.6 Rsamtools_1.10.2       
 [7] Biostrings_2.26.2       GenomicRanges_1.10.6    IRanges_1.16.4         
[10] BiocGenerics_0.4.0      BiocInstaller_1.8.3    

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.20.3   Biobase_2.18.0         biomaRt_2.14.0        
 [4] bitops_1.0-5           BSgenome_1.26.1        DBI_0.2-5             
 [7] GenomicFeatures_1.10.1 grid_2.15.2            parallel_2.15.2       
[10] RCurl_1.95-3           RSQLite_0.11.2         rtracklayer_1.18.2    
[13] stats4_2.15.2          tools_2.15.2           XML_3.95-0.1          
[16] zlibbioc_1.4.0        
