[R] Is there a way to extract data from web pages through some R function?

Greg Hirson ghirson at ucdavis.edu
Wed Jul 1 17:41:42 CEST 2009


Maura,

Try the RCurl package, specifically the functions getURL and getForm.
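
Something along these lines might be a starting point. It is only a sketch and
assumes RCurl is installed. The URLs and form parameter names are copied from
the links in your message; the helper names fetch_page, query_mirecords and
strip_tags are made up for illustration, and the regex-based tag stripping is a
crude stand-in for a real HTML parser (htmlParse in the XML package would
likely be more robust).

```r
# Assumes the RCurl package is installed: install.packages("RCurl")

# Download the raw HTML of a page as one character string.
fetch_page <- function(url) {
  RCurl::getURL(url)
}

# Submit the miRecords search form via GET. The parameter names are copied
# from the query string of the miRecords URL quoted in the question; whether
# the server still accepts them is an assumption.
query_mirecords <- function(species = "Homo sapiens") {
  RCurl::getForm(
    "http://mirecords.umn.edu/miRecords/interactions.php",
    species = species,
    mirna_acc = "Any",
    targetgene_type = "refseq_acc",
    targetgene_info = "",
    v = "yes",
    search_int = "Search"
  )
}

# Hypothetical helper: crude tag stripper that drops anything between < and >.
strip_tags <- function(html) {
  gsub("<[^>]+>", "", html)
}

# Example use (needs network access, so left commented out):
# html <- fetch_page("http://mirdb.org/cgi-bin/search.cgi")
# writeLines(strip_tags(html), "mirdb_search.txt")
```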

Greg

mauede at alice.it wrote:
> I deal with a huge amount of biological data stored in different databases.
> The databases belonging to the Bioconductor project can be accessed through Bioconductor packages.
> Unfortunately, some useful data is stored in databases such as miRDB and miRecords, which offer only an
> interactive HTML interface. See for instance
>  http://mirdb.org/cgi-bin/search.cgi, 
>  http://mirecords.umn.edu/miRecords/interactions.php?species=Homo+sapiens&mirna_acc=Any&targetgene_type=refseq_acc&targetgene_info=&v=yes&search_int=Search
>
> Downloading data manually from the web pages is a painstaking, time-consuming and error-prone activity.
> I came across a Python script that downloads (dumps) whole web pages into a text file that is then parsed.
> This is possible because Python has a library to access web pages.
> But I have no experience with Python programming, nor do I like a programming language whose syntax is indentation-sensitive.
>
> I am *hoping* that there exists some sort of web page (HTML) connection from R. Is there?
>
> Thank you very much for any suggestion.
> Maura

-- 
Greg Hirson
ghirson at ucdavis.edu

Graduate Student
Agricultural and Environmental Chemistry

1106 Robert Mondavi Institute North
One Shields Avenue
Davis, CA 95616