[BioC] chromosome name match among vcf, txdb, BSgenome

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Thu Oct 4 23:40:55 CEST 2012


This sounds awesome.

For this, it may be worthwhile to be able to set the style universally,
for example through a global option.  I expect most of us will choose
one style and stick to it.
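
To make the idea concrete, here is a rough base-R sketch of a session-wide
default. The option name `seqname.style` and the helper
`apply_default_style()` are made up for illustration and are not part of any
Bioconductor package; the sketch also only toggles the "chr" prefix and
ignores real differences between styles (such as chrM versus MT).

```r
# Hypothetical option name; not an actual Bioconductor option.
options(seqname.style = "UCSC")

# Hypothetical helper: convert chromosome names to the globally chosen style.
apply_default_style <- function(seqnames) {
  style <- getOption("seqname.style", default = "UCSC")
  # Strip any existing "chr"/"Chr" prefix before re-applying the style.
  bare <- sub("^chr", "", seqnames, ignore.case = TRUE)
  switch(style,
         UCSC = paste0("chr", bare),  # prefixed names, e.g. chr1
         NCBI = bare,                 # bare names, e.g. 1
         stop("unknown style: ", style))
}

apply_default_style(c("Chr1", "chr2", "3"))
```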

I hope the style will also imply an ordering: chr1 < ... < chr10.
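
By ordering I mean something like the following base-R sketch, where numbered
chromosomes sort numerically rather than alphabetically. `sort_chr()` is a
hypothetical helper written for this example, and it assumes names made of an
optional "chr" prefix followed by a number or a letter.

```r
# Hypothetical helper: numeric chromosomes first, in numeric order,
# followed by the non-numeric ones (e.g. chrC, chrM) in alphabetical order.
sort_chr <- function(seqnames) {
  bare <- sub("^chr", "", seqnames, ignore.case = TRUE)
  # Non-numeric names become NA here; the coercion warning is expected.
  num <- suppressWarnings(as.integer(bare))
  seqnames[order(is.na(num), num, bare)]
}

sort_chr(c("chr10", "chr2", "chrM", "chr1", "chrC"))
```

A plain sort() on the same vector would put chr10 before chr2, which is the
behaviour I would like a style to avoid.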

On Thu, Oct 4, 2012 at 4:18 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
> Hi Rebecca,
>
>
> On 10/04/2012 12:10 PM, sun wrote:
>>
>> Hi All,
>>
>> I am going to use "coding <- predictCoding(vcf, txdb,
>> seqSource=Athaliana)"
>> to detect coding SNPs. The problem is that the chromosome names are not
>> consistent among VCF, txdb and BSgenome. In vcf, the chromosome name is
>> "Chr*", in txdb, the chr name is "Chr", but in BSgenome, the chr name is
>> "chr*".
>>
>> I know I can use renameSeqlevels() to adjust the seqlevels (chromosome
>> names) of the VCF object to match those of the txdb annotation. But how can
>> I adjust the chr name of BSgenome or TranscriptDb?
>
>
> In BioC 2.11 (released yesterday), you can rename the chromosomes of a
> TranscriptDb object, so you could rename the chromosomes of your
> VCF and TranscriptDb objects to match the names of the BSgenome object.
>
> E.g. for the TranscriptDb object:
>
>   seqlevels(txdb) <- sub("^C", "c", seqlevels(txdb))
>
> Note that renaming the chromosomes of a TranscriptDb object is a new
> feature and is not fully implemented yet. For example, if you use
> select() on the object you'll still get the original names (those
> stored in the db), and if you try to specify a chromosome name through
> the 'vals' arg of the transcripts(), exons() and cds() extractors,
> you still need to use the original names. This will be addressed soon.
>
> Our plan is to also support renaming of the chromosomes of BSgenome
> and SNPlocs objects very soon.
>
> Also, an additional level of convenience will be provided via the
> seqnameStyle() getter and setter, so you'll be able to quickly rename
> with something like:
>
>   seqnameStyle(x) <- "UCSC"
>
> or
>
>   seqnameStyle(vcf) <- seqnameStyle(txdb) <- seqnameStyle(genome)
>
> This will work on almost any 'x' object that contains chromosome
> names (GRanges, GRangesList, GappedAlignments, TranscriptDb, VCF,
> BSgenome, SNPlocs, etc.).
>
> Cheers,
> H.
>
>
>
>>
>> Thanks,
>>
>> Rebecca
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319


