[BioC] Does the strand of a microarray probe matter?
cei at ebi.ac.uk
Fri Nov 28 10:44:48 CET 2008
Thank you all for your answers. From these and some offline
conversations I had with people from the microarray facility, I can see
that current microarray protocols attempt to produce strand-specific
samples before hybridizing (but see the ref. Wolfgang sent).
In this case, whenever doing probe mapping we have to be careful to
select only those probes with sequence matching on the appropriate
strand (and this will depend on the platform, since some manufacturers
report the probe sequence, and some the "target" sequence). As I
mentioned before, this has historically not always been the case.
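
To make the strand check concrete, here is a minimal sketch in base R. The
table, its column names (probe, gene, hit_strand, gene_strand) and the helper
keep_strand_matched() are all invented for illustration and do not come from
any platform file or package. Which orientation counts as "appropriate"
depends on the platform and labelling protocol, so it is left as an argument.

```r
# Toy alignment results: one row per probe-to-gene hit (all values invented).
hits <- data.frame(
  probe       = c("p1", "p2", "p3", "p4"),
  gene        = c("gA", "gA", "gB", "gC"),
  hit_strand  = c("+", "-", "-", "+"),  # strand on which the reported sequence aligns
  gene_strand = c("+", "+", "-", "-"),  # strand of the annotated gene
  stringsAsFactors = FALSE
)

# Keep only hits whose orientation relative to the gene is the expected one.
#   expected = "same":     reported sequence aligns to the gene's own strand
#   expected = "opposite": reported sequence aligns to the other strand
keep_strand_matched <- function(hits, expected = c("same", "opposite")) {
  expected <- match.arg(expected)
  same <- hits$hit_strand == hits$gene_strand
  if (expected == "same") {
    hits[same, , drop = FALSE]
  } else {
    hits[!same, , drop = FALSE]
  }
}

sense_hits     <- keep_strand_matched(hits, "same")      # probes p1 and p3
antisense_hits <- keep_strand_matched(hits, "opposite")  # probes p2 and p4
```

For a platform that reports the "target" sequence rather than the probe
sequence, one would presumably flip the expected argument, but that has to be
checked against the manufacturer's documentation.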
One last point, regarding one of Sean's answers:
> There is no attempt to map probes in bioconductor annotation packages
> (at least those maintained by the core). The annotation from which the
> annotation packages are derived come directly from the manufacturers,
Even if no re-mapping is being done (there are many BioC packages not
maintained by the core which do involve re-mapping), my main point was
that Bioconductor annotation structures don't allow more than one "gene"
to be annotated for any particular probe. Do correct me if I'm wrong,
but at least when using AnnotationDbi I found no way of having more than
one gene (EntrezID) per probe.
Another example: Affymetrix does annotate more than one gene (EntrezID)
for their probes (~5% of probes in mouse430_2 with EntrezID have more
than one). So, I guess if the bioconductor core team is using the
manufacturer's annotation, then they are (in some way) removing this
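# bit of R code showing this:
library(mouse4302.db)              # provides the mouse4302ENTREZID map
xx <- as.list(mouse4302ENTREZID)   # named list: probe ID -> EntrezID(s)
any(sapply(xx, length) > 1)        # TRUE if any probe has more than one EntrezID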
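
The same kind of check can be tried without the annotation package on a toy
table. This is only a sketch: the data frame ann and its IDs are invented, and
it assumes the manufacturer's annotation has been read into a long format with
one row per (probe, EntrezID) pair.

```r
# Toy long-format annotation: one row per (probe, EntrezID) pair. IDs are invented.
ann <- data.frame(
  probe  = c("p1", "p2", "p2", "p3", "p4", "p4", "p4"),
  entrez = c("101", "102", "103", "104", "105", "106", "107"),
  stringsAsFactors = FALSE
)

# Collapse to a named list keyed by probe, the same shape as as.list() on a map.
by_probe <- split(ann$entrez, ann$probe)

# Number of distinct genes annotated to each probe.
n_genes <- vapply(by_probe, function(ids) length(unique(ids)), integer(1))

multi      <- names(n_genes)[n_genes > 1]  # probes annotated with several genes
frac_multi <- mean(n_genes > 1)            # fraction of annotated probes affected
```

Running this on the manufacturer's file and on the annotation package side by
side would show whether the multiple-gene entries are being dropped.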
And no, I'm not saying that different EntrezIDs are always unrelated
genes, or that multiple probes mapping to multiple genes are always due
to strand problems.