[BioC] chromosome name match among vcf, txdb, BSgenome

Hervé Pagès hpages at fhcrc.org
Fri Oct 5 01:15:41 CEST 2012


On 10/04/2012 03:13 PM, Tim Triche, Jr. wrote:
> This is a terrific addition, thanks so much Herve for implementing it.

Glad you like it Tim. Thanks!  H.

>
>
> On Thu, Oct 4, 2012 at 1:18 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
>
>     Hi Rebecca,
>
>
>     On 10/04/2012 12:10 PM, sun wrote:
>
>         Hi All,
>
>         I am going to use "coding <- predictCoding(vcf, txdb,
>         seqSource=Athaliana)" to detect coding SNPs. The problem is that
>         the chromosome names are not consistent among the VCF, txdb and
>         BSgenome objects: in the VCF the chromosome names are "Chr*", in
>         the txdb they are "Chr", but in the BSgenome they are "chr*".
>
>         I know I can use renameSeqlevels() to adjust the seqlevels
>         (chromosome names) of the VCF object to match those of the txdb
>         annotation. But how can I adjust the chromosome names of the
>         BSgenome or TranscriptDb object?
>
>
>     In BioC 2.11 (released yesterday), you can rename the chromosomes of a
>     TranscriptDb object, so you could rename the chromosomes of your
>     VCF and TranscriptDb objects to match the names of the BSgenome object.
>
>     E.g. for the TranscriptDb object:
>
>        seqlevels(txdb) <- sub("^C", "c", seqlevels(txdb))
>
>     Note that renaming the chromosomes of a TranscriptDb object is a new
>     feature and is not fully implemented yet. For example, if you use
>     select() on the object you'll still get the original names (those
>     stored in the db), and if you try to specify a chromosome name through
>     the 'vals' arg of the transcripts(), exons() and cds() extractors,
>     you still need to use the original names. This will be addressed soon.
>
>     Our plan is to also support renaming of the chromosomes of BSgenome
>     and SNPlocs objects very soon.
>
>     Also, an additional level of convenience will be provided via the
>     seqnameStyle() getter and setter, so you'll be able to quickly rename
>     with something like:
>
>        seqnameStyle(x) <- "UCSC"
>
>     or
>
>        seqnameStyle(vcf) <- seqnameStyle(txdb) <- seqnameStyle(genome)
>
>     This will work on almost any 'x' object that contains chromosome
>     names (GRanges, GRangesList, GappedAlignments, TranscriptDb, VCF,
>     BSgenome, SNPlocs, etc.)
>
>     Cheers,
>     H.
>
>
>
>
>         Thanks,
>
>         Rebecca
>
>         _________________________________________________
>         Bioconductor mailing list
>         Bioconductor at r-project.org
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>         Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>     --
>     Hervé Pagès
>
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M1-B514
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>
>     E-mail: hpages at fhcrc.org
>     Phone: (206) 667-5791
>     Fax: (206) 667-1319
>
>
>
>
>
> --
> /A model is a lie that helps you see the truth./
> Howard Skipper
> <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
>
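
For anyone who wants to try out a renaming pattern before applying it to
real objects, here is a minimal sketch on plain character vectors. The
three vectors and the helper to_genome_style() are made up for
illustration (they are assumptions, not values read from an actual VCF,
TranscriptDb or BSgenome object), and the sketch only uses base R. With
real objects, the renamed vector would be assigned back with something
like seqlevels(x) <- to_genome_style(seqlevels(x)).

```r
# Illustrative stand-ins for the chromosome names of the three objects.
# These values are assumptions for the sketch, not real seqlevels.
vcf_names    <- c("Chr1", "Chr2", "ChrC", "ChrM")
txdb_names   <- c("Chr1", "Chr2", "ChrC", "ChrM")
genome_names <- c("chr1", "chr2", "chrC", "chrM")

# Hypothetical helper: lower-case only the leading "C".
# sub() replaces the first match only and the pattern is anchored at the
# start, so the trailing "C" in "ChrC" is left untouched.
to_genome_style <- function(x) sub("^C", "c", x)

vcf_renamed  <- to_genome_style(vcf_names)
txdb_renamed <- to_genome_style(txdb_names)

# Any name listed here is missing from the genome names, which would
# mean the pattern needs adjusting before it is applied for real.
unmatched <- setdiff(c(vcf_renamed, txdb_renamed), genome_names)
print(unmatched)
```

Checking for unmatched names first is just a cheap way to catch a
pattern that does not cover every chromosome (organelle names are the
usual suspects) before predictCoding() complains about it.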

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


