[BioC] paper - download - pubmed

Chris Stubben stubben at lanl.gov
Tue Jan 22 18:40:04 CET 2013


   What you suggested works perfectly for PMC ids, but not for PubMed (PM) ids.
   For example, if I need the PDF for PM id 10417722, what should I do?
   From my institute I'm allowed to download papers from various journals,
   but at the moment I can only get the papers annotated with PMC ids, not
   those that only have PM ids.

   There are a few ways to get PMC ids from pubmed ids using E-utilities and
   the genomes package.
   # E-link  - for a list of links see
    subset( einfo("pubmed", links=TRUE), DbTo=="pmc")
   # dbfrom  =  pubmed by default.
   elink(14769935, dbto="pmc", cmd="neighbor", linkname="pubmed_pmc")
   [1] 357076   # = PMC357076
   # or if no PMC id available
    elink(10417722, dbto="pmc", cmd="neighbor", linkname="pubmed_pmc")
   numeric(0)
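
   When looping over many PubMed ids, the two outcomes above (an id, or
   numeric(0)) can be folded into one helper. This is only a minimal sketch in
   base R: format_pmcid is a hypothetical name, not a function in the genomes
   package, and it assumes nothing more than that elink returns a numeric
   vector as shown above.

```r
# Hypothetical helper: turn the numeric result of elink() into a "PMC..." string,
# or NA when no PMC record is linked (elink returned numeric(0)).
format_pmcid <- function(x) {
  if (length(x) == 0) return(NA_character_)
  # format() avoids scientific notation for large ids
  paste0("PMC", format(x[1], scientific = FALSE, trim = TRUE))
}

format_pmcid(357076)       # value returned above for PubMed id 14769935
format_pmcid(numeric(0))   # value returned above for PubMed id 10417722
```

   Returning NA instead of an empty vector keeps the result the same length as
   the input when the helper is used inside sapply.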
   # or use E-fetch and get the abstract - the PMCID is listed before the PMID
   # and you could use grep to grab that.  Again pubmed is the default db
   efetch(14769935, rettype="abstract")
   [26] "PMCID: PMC357076"
   [27] "PMID: 14769935  [PubMed - indexed for MEDLINE]"
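
   The grep step mentioned above might look like the sketch below. It runs on
   two sample lines copied from the efetch output in this message rather than
   on a live query, and it assumes the PMCID label and its value sit on one
   line; the variable names are just for illustration.

```r
# Sample lines copied from the efetch output above
abstract <- c("PMCID: PMC357076",
              "PMID: 14769935  [PubMed - indexed for MEDLINE]")

# Keep only the lines that start with "PMCID:" and strip the label
hits  <- grep("^PMCID:", abstract, value = TRUE)
pmcid <- sub("^PMCID:[[:space:]]*", "", hits)
pmcid
```

   If the abstract has no PMCID line, hits and pmcid are both character(0),
   which mirrors the numeric(0) result from elink.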
   # or get XML from efetch
   library(XML)   # provides xmlParse and xpathSApply
   x <- efetch(14769935, retmode="xml")
   doc <- xmlParse(x)
   xpathSApply(doc, '//ArticleId[@IdType="pmc"]', xmlValue)
   [1] "PMC357076"
   If the Pubmed Id is not linked to PMC, you could read the Pubmed results
   page and check if there is a link to a full text article from the publisher.
   url <- "http://www.ncbi.nlm.nih.gov/pubmed/?term=10417722"
   doc <- xmlParse(url)
   ## the results page includes a namespace, so queries look awful
   xpathSApply(doc, '//x:div[@class="icons"]/x:div/x:a', xmlGetAttr, "href",
               namespaces = c("x" = "http://www.w3.org/1999/xhtml"))
   [1] "http://onlinelibrary.wiley.com/resolve/openurl?genre=article&sid=nlm:pubmed&issn=0960-7412&date=1999&volume=19&issue=1&spage=9"
   You could read that link and find another link to download the PDF, which
   is probably different for each publisher...
   [4]http://onlinelibrary.wiley.com/doi/10.1046/j.1365-313X.1999.00491.x/pdf
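
   Since the PDF link differs by publisher, a script first has to work out
   which publisher a full text link points to. One possible starting point,
   sketched in base R, is to split the link into its host name and query
   parameters. The variable names are illustrative, and this does not locate
   the PDF itself.

```r
# Full text link returned by the xpathSApply call above
link <- "http://onlinelibrary.wiley.com/resolve/openurl?genre=article&sid=nlm:pubmed&issn=0960-7412&date=1999&volume=19&issue=1&spage=9"

# Host name: everything between "://" and the next "/"
host <- sub("^[a-z]+://([^/]+)/.*$", "\\1", link)

# Query string: drop everything up to "?", split on "&", then split each pair on "="
query  <- sub("^[^?]*\\?", "", link)
pairs  <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
params <- setNames(sapply(pairs, `[`, 2), sapply(pairs, `[`, 1))

host
params[c("issn", "volume", "spage")]
```

   The host (or the ISSN) could then be used to choose a publisher-specific
   rule for finding the PDF link on the page.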
   Chris

References

   1. http://www.ncbi.nlm.nih.gov/pubmed/?term=10417722
   2. http://www.w3.org/1999/xhtml
   3. http://onlinelibrary.wiley.com/resolve/openurl?genre=article&sid=nlm:pubmed&issn=0960-7412&date=1999&volume=19&issue=1&spage=9
   4. http://onlinelibrary.wiley.com/doi/10.1046/j.1365-313X.1999.00491.x/pdf

