[BioC] Where to get BAM files for easyRNASeq human use case ALSO ANNOTATION

Nicolas Delhomme delhomme at embl.de
Tue Aug 28 16:04:16 CEST 2012


Dear Richard,

I've implemented SummarizedExperiment support in easyRNASeq version 1.3.14, which should be available from Bioc in a couple of days. If you set outputFormat to "SummarizedExperiment", you'll get an object that contains the annotation used by easyRNASeq in its rowData slot. I've added this to the vignette; see section 6. To make things easier, I've created a 'count' function that does the same as 'easyRNASeq', but with SummarizedExperiment as the default output. I plan to have this function supersede 'easyRNASeq', but as you'll be warned, it will be subject to many changes in the future, so don't rely on it yet in your production code.

One foreseen extension made easier by SummarizedExperiment is fetching additional annotation using biomaRt and/or any of the "org" packages. You've had some discussion about it in this email thread; if you come up with a solution, let me know, as I could easily integrate it into the package. It's always easier to do such things with an example at hand.
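
As a rough illustration of that extension (a sketch only, not something easyRNASeq does today): once an annotation table has been retrieved, for instance with 'select' on an org package, attaching it to a count table is a merge on the shared ID column. The count values and the second transcript ID below are made up for the example; the single annotation row is the one Martin's 'select' call returns further down in this thread.

```r
# Toy count table keyed by FlyBase transcript ID.
# ASSUMPTION: the counts and the second ID are placeholders, not real data.
counts <- data.frame(
  ENSEMBLTRANS = c("FBtr0005009", "FBtr0000000"),
  count = c(120L, 7L),
  stringsAsFactors = FALSE
)

# Annotation table shaped like the data.frame returned by select() on an
# org package; this row is the result quoted later in the thread.
annot <- data.frame(
  ENSEMBLTRANS = "FBtr0005009",
  GENENAME = "Muscle protein 20",
  SYMBOL = "Mp20",
  stringsAsFactors = FALSE
)

# Left join: keep every counted feature, with NA where no annotation exists.
annotated <- merge(counts, annot, by = "ENSEMBLTRANS", all.x = TRUE)
print(annotated)
```

Using all.x = TRUE keeps features that have no annotation, so the count table never loses rows just because a lookup came back empty.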

In addition, I've extended the human use case to retrieve and align reads using a variety of Bioc packages. Sadly, some of them are only available on Unix platforms. I'd be really interested in your feedback; that's in section 7 of the updated vignette.

Best,

Nico


---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------





On Aug 16, 2012, at 8:12 PM, Richard Friedman wrote:

> Martin,
> 
> 	Thanks!
> 	That will get me started!
> 
> Best wishes,
> Rich
> 
> 
> On Aug 16, 2012, at 1:34 PM, Martin Morgan wrote:
> 
>> 
>> Not so much an already worked out protocol but an elaboration of Steve's bet
>> 
>> An AnnotateSeq package would be a useful addition; the info in annaffy is in the org packages, discoverable with 'cols', 'keytypes' (often synonymous with 'cols'), and accessible via 'select'. The plans for the next release are OrganismDb objects that make the merge that one would do across, say, org*, TxDb*, and GO.db packages transparent.
>> 
>> > library(org.Dm.eg.db)
>> > cols(org.Dm.eg.db)
>>  [1] "ENTREZID"     "ACCNUM"       "ALIAS"        "CHR"          "CHRLOC"
>>  [6] "CHRLOCEND"    "ENZYME"       "MAP"          "PATH"         "PMID"
>> [11] "REFSEQ"       "SYMBOL"       "UNIGENE"      "ENSEMBL"      "ENSEMBLPROT"
>> [16] "ENSEMBLTRANS" "GENENAME"     "UNIPROT"      "GO"           "EVIDENCE"
>> [21] "ONTOLOGY"     "FLYBASE"      "FLYBASECG"    "FLYBASEPROT"
>> > select(org.Dm.eg.db, "FBtr0005009", c("GENENAME", "SYMBOL"), "ENSEMBLTRANS")
>>   ENSEMBLTRANS          GENENAME SYMBOL
>> 1  FBtr0005009 Muscle protein 20   Mp20
>> 
>> Martin
>> 
> 
> 