[BioC] Complementing alleles in a VCF

Alex Gutteridge alexg at ruggedtextile.com
Thu Apr 19 16:53:06 CEST 2012


On 19.04.2012 14:34, Martin Morgan wrote:
> Hi Alex --
>
> On 04/19/2012 02:39 AM, Alex Gutteridge wrote:
>> Hi All,
>>
>> I'm having some difficulty with VCF handling and feeding 1000 genome
>> VCFs into predictCoding from the VariantAnnotation package. Can 
>> anyone
>> help?
>>

[...]

>
> as an example
>
>   fl <- system.file("extdata", "ex2.vcf", 
> package="VariantAnnotation")
>   vcf <- readVcf(fl, "hg19")
>
> you can set the strand on vcf with
>
>> strand(rowData(vcf))
> 'factor' Rle of length 5 with 1 run
>   Lengths: 5
>   Values : *
> Levels(3): + - *
>> strand(rowData(vcf)) <- "+"
>> strand(rowData(vcf))
> 'factor' Rle of length 5 with 1 run
>   Lengths: 5
>   Values : +
> Levels(3): + - *
>
> (probably there should be strand,VCF, and strand<-,VCF,ANY methods).
>
> For alt, it seems like DNAStringSetList needs a constructor
>
> DNAStringSetList <-
>     function(...)
> {
>     listData <- list(...)
>     if (length(listData) == 1 && is.list(listData[[1L]]))
>         listData <- listData[[1L]]
>     ok <- sapply(listData, is, "DNAStringSet")
>     if (!all(ok))
>         listData <- lapply(listData, as, "DNAStringSet")
>     IRanges:::newCompressedList("DNAStringSetList", listData)
> }
>
> and then
>
>> orig <- values(alt(vcf))[["ALT"]]
>> dna <- complement(unlist(orig))
>> alt(vcf) <- relist(dna, orig)
>
> (It's confusing, though convenient, how alt() returns a GRanges, but
> alt<- expects 'value' to be a DNAStringSetList; also as you note the
> docs say there should be a VCF,DNAStringSet method for alt<-, but
> there doesn't seem to be one (it wouldn't be appropriate in your 
> case,
> because ALT is a DNAStringSetList).
>
> Martin

Thanks Martin - very helpful. In the end I ran predictCoding after 
ripping the Granges and varAllele information out of the vcf separately. 
This works, but handling the VCF directly would seem more elegant:

> predictCoding(rowData(vcf), txdb, Hsapiens, 
> complement(unlist(values(alt(vcf.filt))[["ALT"]])))

-- 
Alex Gutteridge



More information about the Bioconductor mailing list