[BioC] seqlevels in VCF objects

Paul Leo p.leo at uq.edu.au
Tue Jan 15 01:09:57 CET 2013


There is a discussion of something similar in
"GenomicRanges Use Cases.pdf" (shipped with the GenomicRanges package); see
section 3.1 of the PDF.

You can change the seqlevels directly:

# exonRanges, aligns and newlvls are the objects used in that section of the PDF
seqlevels(exonRanges) <- newlvls  # reorder the levels
names(newlvls) <- seqlevels(aligns)
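
To make the renaming step concrete, here is a minimal sketch on a toy GRanges. The object `gr` and its ranges are made up for illustration; I am assuming the same named-vector approach carries over to the ranges held in a VCF object, which I have not checked here.

```r
library(GenomicRanges)

# Toy ranges using 1000 Genomes style names ("1", "2", "X"); hypothetical data
gr <- GRanges(seqnames = c("1", "2", "X"),
              ranges   = IRanges(start = c(100, 200, 300), width = 1))

# Build the new levels and name them by the current levels,
# so each old level is mapped explicitly onto its new name
newlvls <- paste0("chr", seqlevels(gr))
names(newlvls) <- seqlevels(gr)

# Rename the levels; the seqnames of the ranges follow along
seqlevels(gr) <- newlvls

print(seqlevels(gr))
print(gr)
```

Naming the vector by the old levels keeps the old-to-new mapping explicit, rather than relying on the order of the levels.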



Dr Paul Leo
Senior Bioinformatician
The University of Queensland Diamantina Institute
---------------------------------------------------------------------
TRI, level  ,  37 Kent Street,  Woolloongabba QLD 4102
Tel: +61 7 3443 7072  Mob: 041 303 8691  Fax: +61 7 3443 6966 

-----Original Message-----
From: Murat Tasan <mmuurr at gmail.com>
To: bioconductor at r-project.org
Subject: [BioC] seqlevels in VCF objects
Date: Mon, 14 Jan 2013 18:40:46 -0500

hi all - i've encountered an annoying problem, and i'd like to avoid
read/writing the many GBs required for the blunt-force solution...

the 1000 Genomes project provides a collection of VCF files providing
the genotypes for all found variants.
after reading in the VCF files (via vcf <- readVcf(...)), i have a
VCF object, but the info(vcf) object reveals the chromosome names
(i.e. 'seqlevels') are "1", "2", ..., "X", "Y".
Bioconductor's TxDb.Hsapiens.UCSC.hg19.knownGene object, however, uses
the UCSC standard prefix for chromosome names: "chr1", "chr2", etc.

in trying to subset(...) or predictCoding(...) the VCF data against
the genome objects (including BSgenome.Hsapiens.UCSC.hg19) this causes
an obvious failure.

i tried re-setting the seqlevels of the VCF 'info' object like so
(thinking the seqnames factor just indexes back on the seqlevels as a
key):

seqlevels(vcf@info) <- sprintf("chr%s", seqlevels(vcf@info))

but this doesn't seem to have any effect.

any idea on how to make this bulk change of seqnames for data in VCF objects?

cheers,

-m

_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor