[BioC] Problem reading VCF file using readVcf from package VariantAnnotation

Ulrich Bodenhofer bodenhofer at bioinf.jku.at
Wed Apr 24 13:53:34 CEST 2013


Hi,

I am trying to read genotype data from a large VCF file using the 
readVcf() function from the VariantAnnotation package. I am not reading 
the entire file (which would crash my R session for lack of memory). 
Instead, I am reading chunks of SNV data located in 200 kbp regions, 
which I specify by first passing a GRanges object to ScanVcfParam(). 
No matter what I do, I get the following error message:

    when the supplied 'genome' vector is named, the names must match the seqnames

As far as I can make sense of this message, it seems that there is some 
mismatch between the genome characteristics in my GRanges object and the 
genome characteristics in the VCF file. I dissected the R object 
returned by scanVcfHeader() and indeed found some interesting 
mismatches: The genome in the VCF file is denoted as "b37" and the 
sequence names are not 100% compatible with hg19. The lengths of 
chromosomes 1-22, X, and Y do match, but the lengths of mitochondrial 
DNA (denoted "M" in hg19 and "MT" in b37) differ by 2. So I forced my 
GRanges object to be 100% compatible with the information stored in the 
VCF file (by copying seqlevels, genome, and seqlengths) and restricted 
my analysis to chromosomes 1-22 and X. However, I still get the same 
error message.
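
For illustration only, the region-wise reading described above could look 
roughly like the following R sketch. It is not the code actually used: 
make_windows() and read_window() are hypothetical helpers, 
"genotypes.vcf.gz" is a placeholder for a bgzip-compressed, tabix-indexed 
VCF file, and passing the genome as the single unnamed string "b37" is an 
assumption.

```r
library(VariantAnnotation)  # provides readVcf() and ScanVcfParam()
library(GenomicRanges)      # provides GRanges()
library(Rsamtools)          # provides TabixFile()

# Hypothetical helper: tile one sequence into consecutive windows of `width` bp.
# The sequence name must be spelled as in the VCF file (e.g. "1", not "chr1").
make_windows <- function(seqname, seq_length, width = 200000L) {
    starts <- seq(1L, seq_length, by = width)
    ends <- pmin(starts + width - 1L, seq_length)
    GRanges(seqnames = seqname, ranges = IRanges(start = starts, end = ends))
}

# Hypothetical helper: read only the variants that fall into one window.
# Assumes the tabix index sits next to the file as <vcf_path>.tbi.
read_window <- function(vcf_path, window, genome = "b37") {
    param <- ScanVcfParam(which = window)
    readVcf(TabixFile(vcf_path), genome = genome, param = param)
}

# Example: tile the first 1 Mbp of sequence "1" into 200 kbp windows.
windows <- make_windows("1", 1000000L)

# Reading the first window (left commented out because the file is a placeholder):
# vcf <- read_window("genotypes.vcf.gz", windows[1])
```

Reading one window at a time keeps memory use bounded by the densest 
200 kbp region rather than by the size of the whole file.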

I also tried to locate the error message in the source code of the 
VariantAnnotation package to understand better what the problem is, but 
I could not find it. It seems the message is produced by a function that 
VariantAnnotation calls from another package.

Any idea?

Thanks in advance and best regards,
Ulrich


------------------------------------------------------------------------
*Dr. Ulrich Bodenhofer*
Associate Professor
Institute of Bioinformatics

*Johannes Kepler University*
Altenberger Str. 69
4040 Linz, Austria

Tel. +43 732 2468 4526
Fax +43 732 2468 4539
bodenhofer at bioinf.jku.at
http://www.bioinf.jku.at/
