[BioC] makeTranscriptDbFromBiomart - yeast - 2micron plasmid missing

Hervé Pagès hpages at fhcrc.org
Wed Mar 6 09:45:59 CET 2013


Hi Stefanie,

It's not a good idea to start a new thread by replying to an old one.
This confuses most thread-aware email clients: they place the new
question deep inside the old thread, and as a result most people won't
see it.

On 03/05/2013 01:34 AM, Stefanie Tauber wrote:
> Dear List,
>
> I am creating a TranscriptDatabase as follows:
>
> library(GenomicFeatures)
> myDB <- makeTranscriptDbFromBiomart(biomart = "ensembl", dataset = "scerevisiae_gene_ensembl", circ_seqs = c(DEFAULT_CIRC_SEQS, "Mito"))
> myDBx <- cdsBy(myDB, by = "tx", use.names = TRUE)
>
> everything fine so far,
> I am just missing 4 ORFs which are present on the 2 micron plasmid. (R0010W, R0020C, R0030W, R0040C)

There doesn't seem to be any 2-micron plasmid in the Yeast reference
genome currently in use by Ensembl:

   > seqlengths(myDB)
         I      II     III      IV       V      VI     VII    VIII      IX       X
    230218  813184  316620 1531933  576874  270161 1090940  562643  439888  745751
        XI     XII    XIII     XIV      XV     XVI    Mito
    666816 1078177  924431  784333 1091291  948066   85779

No 2-micron plasmid either in UCSC sacCer3:

   > library(BSgenome.Scerevisiae.UCSC.sacCer3)

   > Scerevisiae
   Yeast genome
   |
   | organism: Saccharomyces cerevisiae (Yeast)
   | provider: UCSC
   | provider version: sacCer3
   | release date: April 2011
   | release name: SGD April 2011 sequence
   |
   | sequences (see '?seqnames'):
   |   chrI     chrII    chrIII   chrIV    chrV     chrVI    chrVII   chrVIII
   |   chrIX    chrX     chrXI    chrXII   chrXIII  chrXIV   chrXV    chrXVI
   |   chrM
   |
   | (use the '$' or '[[' operator to access a given sequence)

   > seqlengths(Scerevisiae)
      chrI   chrII  chrIII   chrIV    chrV   chrVI  chrVII chrVIII   chrIX    chrX
    230218  813184  316620 1531933  576874  270161 1090940  562643  439888  745751
     chrXI  chrXII chrXIII  chrXIV   chrXV  chrXVI    chrM
    666816 1078177  924431  784333 1091291  948066   85779

The two yeast genomes above (Ensembl and UCSC) seem to be the same,
although identical chromosome lengths do not guarantee that the
sequences themselves are identical.
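
The length comparison can be spelled out in a few lines of R. This is
only a minimal sketch: the two vectors are copied by hand from the
seqlengths() output above so that the snippet runs without any
Bioconductor package, and the variable names (ens_len, ucsc_len,
same_names, same_lengths) are made up for the illustration. With the
packages loaded, one would presumably start from seqlengths(myDB) and
seqlengths(Scerevisiae) instead.

```r
# Chromosome lengths copied from the seqlengths() output shown above.
ens_len <- c(I = 230218L, II = 813184L, III = 316620L, IV = 1531933L,
             V = 576874L, VI = 270161L, VII = 1090940L, VIII = 562643L,
             IX = 439888L, X = 745751L, XI = 666816L, XII = 1078177L,
             XIII = 924431L, XIV = 784333L, XV = 1091291L, XVI = 948066L,
             Mito = 85779L)
ucsc_len <- c(chrI = 230218L, chrII = 813184L, chrIII = 316620L,
              chrIV = 1531933L, chrV = 576874L, chrVI = 270161L,
              chrVII = 1090940L, chrVIII = 562643L, chrIX = 439888L,
              chrX = 745751L, chrXI = 666816L, chrXII = 1078177L,
              chrXIII = 924431L, chrXIV = 784333L, chrXV = 1091291L,
              chrXVI = 948066L, chrM = 85779L)

# Translate the UCSC names to the Ensembl style:
# drop the "chr" prefix and rename "M" to "Mito".
ucsc_as_ens <- sub("^chr", "", names(ucsc_len))
ucsc_as_ens[ucsc_as_ens == "M"] <- "Mito"
names(ucsc_len) <- ucsc_as_ens

# Same set of sequences, and same length for each of them?
same_names   <- setequal(names(ens_len), names(ucsc_len))
same_lengths <- same_names && all(ens_len[names(ucsc_len)] == ucsc_len)

# Any sequence present in one genome only (e.g. a plasmid)?
setdiff(names(ens_len), names(ucsc_len))
setdiff(names(ucsc_len), names(ens_len))
```

This only compares names and lengths. Checking that the sequences
themselves agree would require the Ensembl sequences as well, which is
not covered here.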

>
>
> If one puts here http://www.yeastgenome.org/cgi-bin/seqTools "R0010W",
> one finds the following info:
> FLP1/R0010W, ORF, on 2-micron plasmid from coordinates 252 to 1523.
>
>
> While the annotation from Ensembl is imported from SGD, no ORFs are listed for the 2-micron plasmid, and therefore
> also not accessible via makeTranscriptDbFromBiomart.
>
> Any hints what I am getting wrong?

I'm not sure why the 2-micron plasmid was dropped by Ensembl (and UCSC)
but that sounds more like a question for them.

H.


>
> Best,
> Stefanie

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


