[BioC] BSgenome: dm3 and panTro2

Herve Pages hpages at fhcrc.org
Wed May 28 19:22:46 CEST 2008


Hi Joseph,

The source packages for dm3 (Fly) and panTro2 (Chimp) are now
available. I've also put dm2 back (used to be part of the
BSgenome family in previous versions of Bioconductor, but was
temporarily broken).
I can now confirm that the chromosome sequences in dm3 are the
same as in FlyBase r5.1. The exact set of sequences provided and
their exact names are a little bit different though:

    library(BSgenome.Dmelanogaster.FlyBase.r51)
    r51 <- BSgenome.Dmelanogaster.FlyBase.r51::Dmelanogaster
    library(BSgenome.Dmelanogaster.UCSC.dm3)
    dm3 <- BSgenome.Dmelanogaster.UCSC.dm3::Dmelanogaster

Then:

   > seqnames(r51)
    [1] "2L"                        "2R"
    [3] "3L"                        "3R"
    [5] "4"                         "X"
    [7] "U"                         "dmel_mitochondrion_genome"
    [9] "2LHet"                     "2RHet"
   [11] "3LHet"                     "3RHet"
   [13] "XHet"                      "YHet"

   > seqnames(dm3)
    [1] "chr2L"     "chr2R"     "chr3L"     "chr3R"     "chr4"      "chrX"
    [7] "chrU"      "chrM"      "chr2LHet"  "chr2RHet"  "chr3LHet"  "chr3RHet"
   [13] "chrXHet"   "chrYHet"   "chrUextra"

To compare chr2L or chrM, use the FlyBase name on the r51 side and
unmask the dm3 sequence:

   > r51[["2L"]] == unmasked(dm3$chr2L)
   [1] TRUE

   > r51[["dmel_mitochondrion_genome"]] == unmasked(dm3$chrM)
   [1] TRUE
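
The same check can be run over every sequence at once. What follows is
only a minimal sketch: flybase_to_ucsc() and compare_assemblies() are
hypothetical helper names (not part of BSgenome), and the name mapping
is inferred solely from the two seqnames() listings above. It assumes
r51 and dm3 have been created as shown earlier, so that BSgenome and
Biostrings are attached.

```r
# Map FlyBase r5.1 sequence names to UCSC dm3 names.
# Assumption: UCSC prefixes every name with "chr" and calls the
# mitochondrial genome "chrM" (as the seqnames() output suggests).
flybase_to_ucsc <- function(flybase_names) {
  ucsc <- paste0("chr", flybase_names)
  ucsc[flybase_names == "dmel_mitochondrion_genome"] <- "chrM"
  ucsc
}

# Compare each r51 sequence with its dm3 counterpart.
# Returns a logical vector named by FlyBase sequence name;
# NA marks a sequence whose mapped name is absent from dm3.
compare_assemblies <- function(r51, dm3) {
  fb <- seqnames(r51)
  ucsc <- flybase_to_ucsc(fb)
  same <- vapply(seq_along(fb), function(i) {
    if (!(ucsc[i] %in% seqnames(dm3))) return(NA)
    r51[[fb[i]]] == unmasked(dm3[[ucsc[i]]])
  }, logical(1))
  setNames(same, fb)
}

# Usage (requires both data packages to be installed):
#   compare_assemblies(r51, dm3)
```

Because the loop runs over the r51 names, chrUextra (which has no
counterpart in r51) is simply not compared.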

The binary versions of the packages for Windows and Mac will follow soon.

Cheers,
H.


Herve Pages wrote:
> Hi Joseph,
> 
> Are you sure that the dm3 assembly provided by UCSC (based on BDGP Release 5)
> is different from the FlyBase r5.1 assembly? If not then you could just use
> the BSgenome.Dmelanogaster.FlyBase.r51 package which contains the FlyBase r5.1
> assembly (I think that the differences between the various 5.y releases from
> FlyBase are on the annotation side only, but the chromosome sequences should
> be the same).
> 
> Anyway I've started building a BSgenome package for dm3. Once it's ready it
> will be easy to verify that the chromosome sequences are indeed the same as
> in FlyBase r5.1 by doing something like:
> 
>   library(BSgenome.Dmelanogaster.FlyBase.r51)
>   r51 <- BSgenome.Dmelanogaster.FlyBase.r51::Dmelanogaster
>   library(BSgenome.Dmelanogaster.UCSC.dm3)
>   dm3 <- BSgenome.Dmelanogaster.UCSC.dm3::Dmelanogaster
>   r51[["2L"]] == unmasked(dm3$chr2L)
> 
> I'll take this opportunity to add the same built-in masks to this new package
> as the ones I've already added to other BSgenome data packages (only Human,
> Mouse and Dog so far). Those built-in masks are new in Bioconductor 2.2 and
> some examples of how to use them are shown in the GenomeSearching vignette
> (this vignette has been moved from the Biostrings pkg to the BSgenome pkg).
> 
> I will also make a BSgenome data pkg for Chimpanzee (with masks too) and post
> here again when this is ready.
> 
> Cheers,
> H.
> 
> 
> joseph wrote:
>> Hi
>> Are there any plans to add the most recent Drosophila and Chimpanzee
>> genomes to the BSgenome list?
>> The most recent UCSC versions are the Apr. 2006 assembly of the
>> D. melanogaster genome (dm3) and the Chimpanzee Genome Mar. 2006
>> (panTro2).  The Mac OS packages would be nice to have.
>> Thanks
>> Joseph
>>
>>
> 


