I looked at the alternative mappings a few months ago after attending a 
seminar given by Stanley Watson, Director of the Mental Health Research 
Institute at the University of Michigan. He recommended always using the 
alternative mappings because of the large discrepancies his group found 
between Affymetrix's mapping and their own mappings of the probes. I don't know 
whether they have any documentation on whether their mappings yield results 
that are more often validated through alternative methodologies or not, but 
they do have quite a lot of documentation on what they did and why they did 
it - see the description of custom CDF files and their new paper from links 
on the page Jim put in his first post. Even if Ensembl or Affymetrix 
updates their annotation based on remapping, the CDFs aren't changed, so 
the summarization and statistical analysis are done using probes that may 
not all map to the same "gene" uniquely. What these alternative mappings do 
is remap each probe and then redefine the probe sets based on all the 
probes that map to a "gene"; it is these re-groupings that matter most. 
Many of the alternative mappings are subsets of other ones, like taking 
only the first 11 probes from the 3' end in cases where there are more 
than 11 probes, so there are not quite as many alternative mappings as it 
first appears.
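
To make the regrouping idea concrete, here is a rough Python sketch. It is not MBNI's actual pipeline: the function name, the input dictionaries, and the toy probe and gene IDs are all made up for illustration, and dropping every probe that does not hit exactly one gene is my assumption about what "uniquely" should mean.

```python
from collections import defaultdict


def regroup_probes(probe_to_genes, dist_from_3prime, max_probes=None):
    """Rebuild gene-level probe sets from a probe -> genes remapping.

    probe_to_genes   : dict, probe id -> set of gene ids it aligns to (hypothetical input)
    dist_from_3prime : dict, probe id -> distance from the 3' end (smaller = closer)
    max_probes       : if given, keep only that many probes nearest the 3' end
    """
    by_gene = defaultdict(list)
    for probe, genes in probe_to_genes.items():
        # Assumption: keep only probes that map to exactly one gene.
        if len(genes) != 1:
            continue
        (gene,) = genes
        by_gene[gene].append(probe)

    probesets = {}
    for gene, probes in by_gene.items():
        # Order probes starting from the 3' end, then optionally truncate.
        probes = sorted(probes, key=lambda p: dist_from_3prime[p])
        if max_probes is not None:
            probes = probes[:max_probes]
        probesets[gene] = probes
    return probesets


# Toy data, invented for the example.
probe_to_genes = {
    "p1": {"GENE_A"},
    "p2": {"GENE_A"},
    "p3": {"GENE_A", "GENE_B"},  # cross-hybridizing probe, dropped
    "p4": {"GENE_B"},
    "p5": set(),                 # no longer maps anywhere, dropped
}
dist_from_3prime = {"p1": 300, "p2": 120, "p3": 50, "p4": 80, "p5": 10}

full = regroup_probes(probe_to_genes, dist_from_3prime)
trimmed = regroup_probes(probe_to_genes, dist_from_3prime, max_probes=1)
print(full)
print(trimmed)
```

The `max_probes` argument corresponds to the "first 11 probes from the 3' end" variants: the trimmed mapping is just a subset of the full one, which is why those variants do not really count as separate mappings.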

I do agree with Jim that coming up with a defensible rationale is 
important, as I was having trouble deciding which mapping might be the best 
to use. Stan Watson would argue that any of them are better than the 
outdated Affymetrix groupings. If Affy did theirs based on Unigene 
clustering, then the new mapping & grouping based on Unigene might be a 
defensible choice. In the end, I succumbed to historical inertia and went 
with Affymetrix's CDF, in part because I do analyses for many organisms, 
and MBNI only has alternative CDFs for human, mouse, and rat. However, I 
was able to get the alternative CDFs to work in Bioconductor with little 
trouble.

As for validating the genes on the magical "significant list", I did get 
some advice at a recent conference to ALWAYS first check the current probe 
mappings for those significant genes, and then concentrate only on those 
that have most or all of their probes where they should be. Does anyone do 
this routinely? Should we be doing it, and do we skip it only because it 
is too time-consuming?
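
For what it's worth, the check itself need not be time-consuming once you have a current probe-level remapping in hand. Below is a minimal Python sketch of the idea. All names and the toy IDs are hypothetical, and the 0.75 cutoff is just my stand-in for "most or all"; nobody gave me a specific number.

```python
def fraction_on_target(probes, probe_to_genes, intended_gene):
    """Fraction of a probe set's probes that map uniquely to the intended gene."""
    if not probes:
        return 0.0
    hits = sum(1 for p in probes if probe_to_genes.get(p, set()) == {intended_gene})
    return hits / len(probes)


def check_significant(significant, probesets, intended, probe_to_genes,
                      min_fraction=0.75):
    """Return (probe set id, fraction) for significant probe sets that pass.

    min_fraction is an arbitrary cutoff chosen for illustration.
    """
    kept = []
    for ps in significant:
        frac = fraction_on_target(probesets[ps], probe_to_genes, intended[ps])
        if frac >= min_fraction:
            kept.append((ps, frac))
    return kept


# Toy data, invented for the example.
probesets = {
    "1001_at": ["p1", "p2", "p3", "p4"],
    "1002_at": ["p5", "p6", "p7", "p8"],
}
intended = {"1001_at": "GENE_A", "1002_at": "GENE_B"}
probe_to_genes = {
    "p1": {"GENE_A"}, "p2": {"GENE_A"}, "p3": {"GENE_A"}, "p4": {"GENE_A"},
    "p5": {"GENE_B"},            # still on target
    "p6": {"GENE_C"},            # now maps to a different gene
    "p7": {"GENE_B", "GENE_C"},  # no longer unique
    "p8": set(),                 # no longer maps anywhere
}

kept = check_significant(["1001_at", "1002_at"], probesets, intended, probe_to_genes)
print(kept)
```

In this toy case only 1001_at survives, since just one of the four 1002_at probes still maps uniquely to its intended gene.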


