[BioC] GenomicRangesUseCases Manual :: Typo?

Marc Carlson mcarlson at fhcrc.org
Thu Apr 14 21:16:10 CEST 2011


Hi Paul,

So the code you list from the manual should actually be perfectly fine. 
You really can subset (single bracket) an AnnDbBimap object by name like 
this.

Here is a concrete example:

library(org.Sc.sgd.db)
## grab some arbitrary keys just to do an example
systNames <- head(mappedLkeys(org.Sc.sgdCHRLOCEND))
## then subset
toTable(org.Sc.sgdGENENAME[systNames])


And in fact, you really want to do it that way when possible because it 
will be more efficient than the approach that you have proposed.

In the 1st example below, the manual subsets the AnnDbBimap object 
itself, which means that it does not have to retrieve *everything* 
for this mapping from the underlying database: only the stuff that you 
list in "systNames".  IOW, the subset on the AnnDbBimap defines a 
narrower DB query that only gets made into a small data.frame when you 
subsequently call toTable() on it.
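
To make the difference concrete, here is a minimal sketch of the two 
routes side by side.  It assumes org.Sc.sgd.db is installed; the variable 
names (sub, small, full) are made up for illustration, and the row counts 
you see will depend on your annotation release.

```r
library(org.Sc.sgd.db)

## same arbitrary keys as in the example above
systNames <- head(mappedLkeys(org.Sc.sgdCHRLOCEND))

## route 1: subset the Bimap first; the result is still a Bimap
sub <- org.Sc.sgdGENENAME[systNames]
is(sub, "Bimap")
small <- toTable(sub)      # only now is a (small) data.frame built

## route 2: build the full data.frame up front
full <- toTable(org.Sc.sgdGENENAME)

## compare how many rows each route had to materialize
c(subset_first = nrow(small), table_first = nrow(full))
```

The second count should be much larger than the first, which is the 
memory cost described in the next paragraph.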

But in the 2nd example you pull the entire thing into memory right at 
the start as a big gigantic data.frame using toTable(), and then you 
pare that down to the size you actually want.  That is kind of a waste 
of memory.  For most things it won't be an issue, but for some of the GO 
mappings you might start to really care.  Also, unless your systNames 
are actually integers (which will be interpreted as indices and NOT as 
names), for the 2nd example to actually work I expect that you will have 
to do something more like this:

toTable(org.Sc.sgdGENENAME)[listOfAllNamesInTheFrameProducedBytoTable %in% 
systNames,]

So that way of doing things involves not only more processing, but more 
typing as well.
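
The row-indexing pitfall can be seen without any annotation package.  The 
sketch below uses a hand-made toy data.frame standing in for the toTable() 
output; the column names and the three rows are assumptions chosen for 
illustration, not something retrieved from the database.

```r
## toy stand-in for the data.frame that toTable() returns (hand-made values)
tab <- data.frame(
  systematic_name = c("YAL001C", "YAL002W", "YAL003W"),
  gene_name       = c("TFC3", "VPS8", "EFB1"),
  stringsAsFactors = FALSE
)
systNames <- c("YAL001C", "YAL003W")

## character row indices are matched against the row names ("1", "2", "3"
## here), so none of the keys match and every returned row is NA
byName <- tab[systNames, ]

## a logical mask with one entry per row selects the intended rows
byMask <- tab[tab$systematic_name %in% systNames, ]
```

Note that the mask is built as "column %in% keys", so that it has one 
TRUE/FALSE per row of the table.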

Does this clarify things?


   Marc


On 04/13/2011 04:09 AM, KORIR, PAUL wrote:
> Hi,
>
> Question to the GenomicRanges maintainer:
>
> In the GenomicRanges Use Cases manual, page 9, 7th line from the top of the page is the R line:
>> toTable(org.Sc.sgdGENENAME[systNames])
> I found that the following line worked.
>> toTable(org.Sc.sgdGENENAME)[systNames,]
> Is that correct? If so please amend it in the manual.
>
> P. Kibet Korir
> Bioinformatics PhD Student,
> NUI Galway
> +353 86 224 19 66


