[R] Extract Data from a Webpage

Chuck Cleland ccleland at optonline.net
Wed Dec 17 01:11:20 CET 2008

Hi All:
  I would like to extract the provider name, address, and phone number
from multiple webpages like this:


  Based on searching R-help archives, it seems like the XML package
might have something useful for this task.  I can load the XML package
and supply the url as an argument to htmlTreeParse(), but I don't know
how to go from there.
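
  For what it's worth, here is a minimal sketch of the direction I have
in mind, assuming the page can be parsed into an internal document and
queried with XPath via xpathSApply(). The HTML fragment, the class names
("name", "address", "phone"), and the helper get_field() are all made up
for illustration; the real pages will need different XPath expressions.

```r
library(XML)

# Hypothetical page fragment. The markup of the real pages is not shown
# here, so the tag and class names below are assumptions.
html <- '<html><body>
  <div class="provider">
    <span class="name">Example Clinic</span>
    <span class="address">1 Main Street, Anytown</span>
    <span class="phone">(555) 555-0100</span>
  </div>
</body></html>'

# Parse into an internal (C-level) document so XPath queries can be run.
# For a live page, the URL would be passed instead, without asText = TRUE.
doc <- htmlTreeParse(html, asText = TRUE, useInternalNodes = TRUE)

# Hypothetical helper: text content of every <span> with the given class.
get_field <- function(doc, cls) {
  xpathSApply(doc, sprintf("//span[@class='%s']", cls), xmlValue)
}

# One row per provider found on the page.
providers <- data.frame(
  name    = get_field(doc, "name"),
  address = get_field(doc, "address"),
  phone   = get_field(doc, "phone"),
  stringsAsFactors = FALSE
)

free(doc)  # release the internal document
print(providers)
```

  For multiple pages, the same steps could presumably be wrapped in a
function and applied over a vector of URLs with lapply(), then combined
with do.call(rbind, ...).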


Chuck Cleland

> sessionInfo()
R version 2.8.0 Patched (2008-12-04 r47066)

LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] XML_1.98-1

Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
