[BioC] VariantAnnotation - dots in the INFO field give an error

Jarno Tuimala jtuimala at gmail.com
Tue Nov 13 14:36:46 CET 2012


Dear Vincent,

You're right! The vcf object was in fact created and the file was read successfully.

So the problem is solved: it was a user error.

An older version of the package does seem to give an error, though, and
since I was running the two versions in parallel, I mixed up the two
sessions. Sorry about that.
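
For anyone who finds this thread later, here is a minimal base-R sketch of
how to check whether a call only warned or actually failed. It does not use
VariantAnnotation at all: read_with_warning() is a hypothetical stand-in
for a reader that warns and still returns its result, and the message text
is simply copied from the warning quoted below.

```r
# Hypothetical stand-in for a reader that emits a warning but still returns
# a value (illustrative only, not part of any package).
read_with_warning <- function() {
  warning("record 1 (and others?) INFO '.' not found")
  "parsed object"
}

# Collect warning messages without aborting the call.
collected <- character(0)
result <- withCallingHandlers(
  read_with_warning(),
  warning = function(w) {
    collected <<- c(collected, conditionMessage(w))
    invokeRestart("muffleWarning")  # resume the call after the warning
  }
)

# After a warning the assignment completes, so the object exists.
exists("result")
collected

# After an error the call is abandoned; tryCatch() returns the handler's value.
failed <- tryCatch(stop("could not read file"), error = function(e) NULL)
is.null(failed)
```

The simplest check is the one Vincent suggested: after the call, see
whether the object exists.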

- Jarno




On Mon, Nov 12, 2012 at 1:12 PM, Vincent Carey
<stvjc at channing.harvard.edu> wrote:
> What you reported is a warning, not an error. Did the object "vcf" get
> created?
>
> On Mon, Nov 12, 2012 at 4:39 AM, Jarno Tuimala <jtuimala at gmail.com> wrote:
>>
>> Hello!
>>
>> I have a problem reading a VCF file with the VariantAnnotation
>> package. The filtered VCF file (attached as text below) has been
>> generated with vcftools.
>>
>> This is what I tried in R and the resulting error message:
>>
>> > library(VariantAnnotation)
>> > vcf<-readVcf("vcftools.filtered.vcf", "hg19")
>>
>> Warning message:
>> In doTryCatch(return(expr), name, parentenv, handler) :
>>   record 1 (and others?) INFO '.' not found
>>
>> If I understood it correctly, the dots in the INFO column of the VCF
>> file cause the problem.
>>
>> Is there an alternative way to read this VCF file and annotate it with
>> the VariantAnnotation package?
>>
>> Best Regards,
>> Jarno
>>
>>
>> ----
>>
>> This is the session info:
>>
>> R version 2.15.1 Patched (2012-07-25 r59963)
>> Platform: i386-w64-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=Finnish_Finland.1252  LC_CTYPE=Finnish_Finland.1252
>> [3] LC_MONETARY=Finnish_Finland.1252 LC_NUMERIC=C
>> [5] LC_TIME=Finnish_Finland.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] VariantAnnotation_1.4.3 Rsamtools_1.10.1        Biostrings_2.26.2
>> [4] GenomicRanges_1.10.2    IRanges_1.16.3          BiocGenerics_0.4.0
>>
>> loaded via a namespace (and not attached):
>>  [1] AnnotationDbi_1.20.2   Biobase_2.18.0         biomaRt_2.14.0
>>  [4] bitops_1.0-4.1         BSgenome_1.26.1        DBI_0.2-5
>>  [7] GenomicFeatures_1.10.0 parallel_2.15.1        RCurl_1.95-1.1
>> [10] RSQLite_0.11.2         rtracklayer_1.18.0     stats4_2.15.1
>> [13] tools_2.15.1           XML_3.95-0.1           zlibbioc_1.4.0
>>
>>
>> And this is the VCF file:
>>
>> ##fileformat=VCFv4.1
>> ##samtoolsVersion=0.1.18 (r982:295)
>> ##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
>> ##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
>> ##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
>> ##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
>> ##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
>> ##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
>> ##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
>> ##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
>> ##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio of genotype likelihoods with and without the constraint">
>> ##INFO=<ID=UGT,Number=1,Type=String,Description="The most probable unconstrained genotype configuration in the trio">
>> ##INFO=<ID=CGT,Number=1,Type=String,Description="The most probable constrained genotype configuration in the trio">
>> ##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
>> ##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
>> ##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
>> ##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
>> ##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
>> ##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
>> ##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
>> ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
>> ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
>> ##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
>> ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
>> ##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
>> ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
>> #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  HG00171 HG00174 NA18486 NA18489
>> 20      6731335 .       T       C       80.5    .       .       GT:PL:GQ        1/1:0,0,0:3     1/1:0,0,0:3     1/1:0,0,0:3     1/1:113,12,0:13
>> 20      6732603 .       A       T       25.7    .       .       GT:PL:GQ        0/0:0,6,54:8    0/0:0,0,0:3     0/0:0,0,0:3     0/1:58,0,27:35
>> 20      6736189 .       A       G       47.8    .       .       GT:PL:GQ        0/1:0,0,0:3     0/1:0,0,0:3     0/1:0,0,0:3     1/1:79,6,0:6
>> 20      6736562 .       C       A       20.4    .       .       GT:PL:GQ        0/0:0,0,0:4     0/0:0,0,0:4     0/1:53,0,32:40  0/0:0,9,98:11
>> 20      6737384 .       A       G       62      .       .       GT:PL:GQ        0/1:0,0,0:3     0/1:0,0,0:3     0/1:0,0,0:3     0/1:92,0,95:92
>> 20      6737551 .       G       A       26.3    .       .       GT:PL:GQ        1/1:30,3,0:4    0/1:0,3,40:4    0/1:0,0,0:3     1/1:34,3,0:4
>> 20      6738766 .       T       A       34.3    .       .       GT:PL:GQ        0/1:0,0,0:3     0/0:0,3,33:4    0/1:0,0,0:3     1/1:69,6,0:4
>> 20      6739398 .       G       A       64      .       .       GT:PL:GQ        1/1:0,0,0:3     1/1:0,0,0:3     1/1:0,0,0:3     1/1:96,9,0:10
>> 20      6740366 .       C       T       25.8    .       .       GT:PL:GQ        0/1:0,0,0:3     0/1:0,0,0:3     0/1:0,0,0:3     1/1:57,6,0:6
>> 20      6740850 .       G       A       34.4    .       .       GT:PL:GQ        0/1:0,0,0:3     0/0:0,6,59:6    0/1:0,0,0:3     1/1:70,6,0:3
>> 20      6743016 .       T       C       87.2    .       .       GT:PL:GQ        0/1:0,0,0:3     0/1:0,3,31:3    0/1:0,0,0:3     1/1:124,12,0:10
>> 20      6743306 .       A       C       39.8    .       .       GT:PL:GQ        0/1:0,0,0:3     1/1:71,6,0:6    0/1:0,0,0:3     0/1:0,0,0:3
>> 20      6746498 .       C       T       17.4    .       .       GT:PL:GQ        0/1:0,0,0:3     0/0:0,3,38:4    0/1:31,3,0:4    0/1:24,0,54:26
>> 20      6749158 .       C       A       18.3    .       .       GT:PL:GQ        0/0:0,3,29:8    0/0:0,3,32:8    0/1:53,0,30:40  0/0:0,21,159:25
>> 20      6749671 .       A       C       21.3    .       .       GT:PL:GQ        0/0:0,9,65:7    0/1:33,3,0:3    0/1:28,3,0:3    0/1:0,0,0:3
>> 20      6751034 .       A       G       999     .       .       GT:PL:GQ        0/0:0,24,189:19 0/1:33,0,141:38 1/1:255,105,0:99        1/1:255,66,0:65
>> 20      6751316 .       A       G       155     .       .       GT:PL:GQ        0/0:0,3,22:4    0/0:0,6,43:6    1/1:116,12,0:8  0/1:84,0,25:29
>> 20      6754246 .       G       A       16.4    .       .       GT:PL:GQ        0/0:0,0,0:3     0/0:0,3,20:6    0/0:0,0,0:3     0/1:48,0,43:45
>> 20      6755598 .       T       G       46      .       .       GT:PL:GQ        1/1:0,0,0:3     1/1:0,0,0:3     1/1:0,0,0:3     1/1:78,9,0:10
>> 20      6756217 .       G       A       14.2    .       .       GT:PL:GQ        0/0:0,3,38:7    0/0:0,3,38:7    0/0:0,0,0:4     0/1:47,0,26:34
>> 20      6760431 .       C       A       36.8    .       .       GT:PL:GQ        0/1:0,0,0:3     0/1:0,0,0:3     0/1:0,0,0:3     1/1:68,6,0:6
>> 20      6761512 .       C       T       104     .       .       GT:PL:GQ        1/1:0,0,0:3     1/1:0,0,0:3     1/1:0,0,0:3     1/1:136,12,0:13
>> 20      6762025 .       G       A       29.3    .       .       GT:PL:GQ        0/1:0,3,37:4    1/1:32,3,0:4    0/1:0,0,0:3     1/1:35,3,0:4
>> 20      6765841 .       A       C       35.3    .       .       GT:PL:GQ        0/0:0,3,31:4    0/1:0,0,0:3     0/1:0,0,0:3     1/1:70,6,0:4
>> 20      6767119 .       G       C       104     .       .       GT:PL:GQ        1/1:0,0,0:3     1/1:0,0,0:3     1/1:0,0,0:3     1/1:136,12,0:13
>> 20      6767354 .       C       T       24      .       .       GT:PL:GQ        0/1:0,0,0:3     0/1:0,0,0:3     0/1:0,0,0:3     0/1:54,0,111:55
>> 20      6767543 .       T       C       14.2    .       .       GT:PL:GQ        0/0:0,3,31:7    0/0:0,3,32:7    0/0:0,0,0:4     0/1:47,0,22:30
>> 20      6769102 .       T       TC      117     .       .       GT:PL:GQ        1/1:0,0,0:6     1/1:40,3,0:9    1/1:40,3,0:9    1/1:80,6,0:11
>> 20      6769533 .       G       A       21.4    .       .       GT:PL:GQ        0/1:0,0,0:3     0/0:0,6,64:6    0/1:0,0,0:3     1/1:57,6,0:3
>> 20      6769676 .       A       G       27.2    .       .       GT:PL:GQ        0/0:0,3,32:5    0/0:0,3,34:5    0/0:0,0,0:3     0/1:64,6,0:3
>> 20      6769714 .       T       C       63.2    .       .       GT:PL:GQ        1/1:68,6,0:9    1/1:0,0,0:4     1/1:0,0,0:4     1/1:29,3,0:7
>> 20      6769877 .       T       C       14.5    .       .       GT:PL:GQ        0/1:27,0,27:27  0/1:0,0,0:3     0/0:0,6,68:6    0/1:26,3,0:4
>> 20      6769893 .       C       A       16.7    .       .       GT:PL:GQ        0/0:0,3,38:5    0/0:0,0,0:3     0/0:0,6,63:8    0/1:54,6,0:4
>>
>
>


