[BioC] hgu133aPMID redundancy

Lynn Young lynny at mail.nih.gov
Sat Aug 28 00:58:07 CEST 2004


Dear Bioconductor group:

A few minutes ago, we downloaded and installed the annotation package 
for hgu133a to ensure that we have the latest version.  Thank you for 
providing the PubMed identifiers.

We notice a redundancy for the following example:

 > library(annotate)
 > library("hgu133a")
 > get("216572_at", env=hgu133aPMID)
[1] "95045392" "20530220" "11078474" "7957066"


When we look up the above identifiers in PubMed, we find that 95045392 
and 7957066 refer to the same document, as do 20530220 and 11078474.  
In the XML display format, the <ArticleIdList> element shows that the 
first two identifiers above are MEDLINE IDs and the last two are 
PubMed IDs.

As we would like to further analyze documents in batch mode, could you 
kindly look at the possibility of removing this redundancy?
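
In the meantime, the duplicates could be collapsed on the user side. 
The following is only a minimal sketch, assuming a lookup table from 
MEDLINE IDs to PubMed IDs has already been built by hand from the PubMed 
XML records.  The names medline_to_pmid and collapse_ids are hypothetical 
and are not part of annotate or hgu133a.

```r
# Hypothetical lookup table: MEDLINE ID -> PubMed ID, filled in by hand
# from the two pairs identified above.
medline_to_pmid <- c("95045392" = "7957066",
                     "20530220" = "11078474")

# Replace any MEDLINE ID found in the lookup table with its PubMed ID,
# then drop the duplicates that result.
collapse_ids <- function(ids, lookup) {
  hit <- ids %in% names(lookup)
  ids[hit] <- lookup[ids[hit]]
  unique(unname(ids))
}

# The identifiers returned for "216572_at" in the example above
ids <- c("95045392", "20530220", "11078474", "7957066")
collapse_ids(ids, medline_to_pmid)
# only the two PubMed IDs remain: "7957066" and "11078474"
```

Identifiers that are not in the lookup table pass through unchanged, so 
this only helps for pairs that have already been identified.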

Best regards,
Lynn Young


