[BioC] IlluminaHumanMethylation450k.db Reference Versions
mcarlson at fhcrc.org
Fri Nov 11 01:58:51 CET 2011
All of the mappings (except those that Tim has previously "adjusted",
such as CHR) will "hide" the probes that match multiple Entrez Gene IDs.
This happens because the 450k.db was made as a chip package, and chip
packages were specifically designed to hide such probes by default.
They behave this way because chip packages were originally designed
primarily for mRNA microarray platforms. So the default behavior is not
really broken, or even really inappropriate; it's just that this is an
atypical use case. But the data is all in there, and you absolutely CAN
get to it with very little trouble. You just have to use the
toggleProbes method to expose it.
You can use it like this:
## step 1: create a mapping that exposes ALL the probes regardless of
## how many genes they match:
## step 2: use that mapping instead of
## You can compare the two mappings to see how they behave differently:
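A minimal sketch of the steps above, assuming the SYMBOL mapping
(IlluminaHumanMethylation450kSYMBOL) as the example. The variable names
allSymbols and probes are made up for illustration, and any other
probe-to-gene mapping in the package should behave the same way:

```r
library(IlluminaHumanMethylation450k.db)

## step 1: expose every probe, including those that match several genes.
## toggleProbes() also accepts "single" and "multiple" in place of "all".
allSymbols <- toggleProbes(IlluminaHumanMethylation450kSYMBOL, "all")

## step 2: query allSymbols wherever you would have used the default map
probes <- head(mappedkeys(allSymbols))
mget(probes, allSymbols)

## compare the two mappings: the toggled map should report at least as
## many mapped probes as the default one
count.mappedkeys(IlluminaHumanMethylation450kSYMBOL)
count.mappedkeys(allSymbols)
```

toggleProbes returns a new map and leaves the original untouched, so
the default and the "all" versions can be used side by side.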
I understand that Tim is planning to modify this package so that its
default behavior is more in line with what users of this platform
expect, which is a terrific thing for him to do. But in the meantime,
the package is perfectly serviceable; you just have to know how to use
the toggleProbes method.
On 11/07/2011 07:09 AM, Tim Triche, Jr. wrote:
> CHR does what is expected of the mapping in that it returns the chromosome
> of the probe. It is constructed by overwriting the bimap for CHR with
> that for CHR37 on export. Without this kludge, tens of thousands of probes
> return NA as their chromosome, which is clearly incorrect.
> As it happens, due to a long-standing tradition of excluding 'promiscuous'
> probes, the default behavior of ALIAS2PROBE (for example) is also wrong.
> I'm about to upload 2.0.6 with that patched.
> The problem with gene-centric annotations of the sort used in Bioconductor
> .db packages is that they're gene-centric; the mapping from probes to
> genes, locations, chromosomes, GO annotations, KEGG pathways, and the like
> is done through EntrezGene IDs. There has been some discussion as to
> whether completely reannotating the chip might not be a better idea in this
> respect, i.e. mapping the probes to the nearest TSS. As I have gained more
> experience with the GRanges architecture, I have realized that GRanges are
> the more sensible approach to annotating the probes on the 450k.
> Nonetheless, the 450k.db package is out there so it ought to do what it's
> expected to, unless or until everything transitions to the manifest package
> that Kasper and Martin Aryee put together.
> On Sun, Nov 6, 2011 at 11:00 PM, Dario Strbenac <D.Strbenac at garvan.org.au> wrote:
>> In the package IlluminaHumanMethylation450k.db, there are three data
>> objects relating probes to chromosomes. They are
>> IlluminaHumanMethylation450kCHR, IlluminaHumanMethylation450kCHR36, and
>> IlluminaHumanMethylation450kCHR37. I wonder what the reason for having
>> IlluminaHumanMethylation450kCHR is, and what reference was used, since that
>> is not explained in the help page of IlluminaHumanMethylation450kCHR. Is
>> it redundant?
>> Also, the mapping to locations, IlluminaHumanMethylation450kCHRLOC, is
>> only available for hg19. There should also be one for hg18, or otherwise
>> the IlluminaHumanMethylation450kCHR36 should not be supported.
>> I am referring to version 1.4.6 of the IlluminaHumanMethylation450k.db
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010