[BioC] SNPlocs -> VCF or similar

Hervé Pagès hpages at fhcrc.org
Wed Jan 30 21:52:44 CET 2013


Hi Michael,

On 01/30/2013 11:21 AM, Michael Lawrence wrote:
> Hi,
>
> Is there any easy way to convert the output of getSNPlocs(), i.e., a
> GRanges with ambiguity codes, to something more like a VCF? Or would it be
> better to just access the dbSNP VCF file?
>
> I've made a function that does the above (shown below), but it would be
> nice to have a built-in path.
>
> # Drop rows whose alt allele equals the reference base at that position.
> stripRefSNPs <- function(x) {
>    x[x$alt != getSeq(Hsapiens, x, as.character = TRUE)]
> }
> # Expand each IUPAC ambiguity code into one row per allele.
> explodeSNPAlleles <- function(x) {
>    alleles <- strsplit(IUPAC_CODE_MAP[x$alleles_as_ambig], NULL)
>    x <- x[rep(seq_len(length(x)), elementLengths(alleles))]
>    x$alleles_as_ambig <- NULL
>    x$alt <- unlist(alleles)
>    stripRefSNPs(x)
> }

A nice shortcut would be to call the VCF() constructor directly
on the GRanges object returned by getSNPlocs():

   library(SNPlocs.Hsapiens.dbSNP.20120608)
   chr1_snps <- getSNPlocs("ch1", as.GRanges=TRUE)

   library(VariantAnnotation)
   chr1_vcf <- VCF(chr1_snps)

Actually it works:

   > chr1_vcf
   class: CollapsedVCF
   dim: 3517088 0
   rowData(vcf):
     GRanges with 2 metadata columns: RefSNP_id, alleles_as_ambig
   info(vcf):
     DataFrame with 0 columns:
   geno(vcf):
     SimpleList of length 0:

and seems to produce a valid VCF object, although I doubt it holds
the information where VCF objects normally keep it: the alleles are
just metadata columns on the ranges, and info and geno are empty.

Note that the VCF() constructor also works on the "exploded" GRanges
object returned by your code:

   > exploded_chr1_snps <- explodeSNPAlleles(chr1_snps)
   > VCF(exploded_chr1_snps, collapsed=FALSE)
   class: ExpandedVCF
   dim: 3537354 0
   rowData(vcf):
     GRanges with 2 metadata columns: RefSNP_id, alt
   info(vcf):
     DataFrame with 0 columns:
   geno(vcf):
     SimpleList of length 0:

but, as before, the information is probably not stored in the
expected way.
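
For what it's worth, the "explode" step itself does not depend on
GRanges. Here is a rough base-R sketch of the same idea on a plain
data.frame. Everything in it is made up for illustration: the small
iupac vector is a hand-written stand-in for IUPAC_CODE_MAP, the ref
column replaces the getSeq() lookup, and explode_alleles is a
hypothetical name, not a function from any package:

```r
# Hand-written subset of the IUPAC ambiguity codes (stand-in for
# Biostrings' IUPAC_CODE_MAP; assumption, not the package object).
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           M = "AC", R = "AG", W = "AT",
           S = "CG", Y = "CT", K = "GT")

# Toy SNP table; ids, positions and reference bases are invented.
snps <- data.frame(
  RefSNP_id        = c("1", "2", "3"),
  pos              = c(100L, 200L, 300L),
  alleles_as_ambig = c("R", "Y", "M"),
  ref              = c("A", "C", "C"),
  stringsAsFactors = FALSE
)

# One row per allele, then drop the rows where the allele is the
# reference base (what stripRefSNPs() does via getSeq()).
explode_alleles <- function(x, map = iupac) {
  alleles <- strsplit(unname(map[x$alleles_as_ambig]), "")
  out <- x[rep(seq_len(nrow(x)), lengths(alleles)), , drop = FALSE]
  out$alt <- unlist(alleles)
  out$alleles_as_ambig <- NULL
  out <- out[out$alt != out$ref, , drop = FALSE]
  rownames(out) <- NULL
  out
}

exploded <- explode_alleles(snps)
```

On this toy input each code expands to two alleles, one of which is
the reference, so three rows remain, with alt G, T and A.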

Cheers,
H.

>
> Thanks,
> Michael

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


