[BioC] How do I parse HTML table using RCurl?

Ruppert Valentino ruppert7 at hotmail.com
Mon Mar 14 23:35:19 CET 2011


Hi James,
 
Many thanks for telling me that TargetScan is accessible via AnnotationDbi. This will help me solve the problem in a different way, as the others suggested.
 
Can you tell me whether Bioconductor has a resource for accessing miRanda (http://www.microrna.org/microrna/) and PicTar (http://pictar.mdc-berlin.de/cgi-bin/PicTar_vertebrate.cgi)?
 
If so, which package can I use?
 
 
Many thanks
 
Ruppert
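
A minimal sketch of the AnnotationDbi route mentioned above, assuming the
targetscan.Hs.eg.db package is installed and exposes the maps
targetscan.Hs.egMIRBASE2FAMILY and targetscan.Hs.egTARGETS (my understanding
of the package, worth checking against its help pages). The helper name
targets_for_mirna is hypothetical, and the result is a vector of Entrez Gene
IDs rather than gene symbols.

```r
# Hypothetical helper (not part of any package): Entrez Gene IDs predicted
# as targets of one miRNA, looked up via its TargetScan miRNA family.
targets_for_mirna <- function(mirna) {
  if (!requireNamespace("targetscan.Hs.eg.db", quietly = TRUE)) {
    stop("install targetscan.Hs.eg.db from Bioconductor first")
  }
  suppressPackageStartupMessages(library(targetscan.Hs.eg.db))

  # miRBase identifier -> TargetScan miRNA family name
  # (map name is an assumption, see the note above)
  family <- unlist(mget(mirna, targetscan.Hs.egMIRBASE2FAMILY, ifnotfound = NA))
  family <- unique(family[!is.na(family)])
  if (length(family) == 0) return(character(0))

  # targetscan.Hs.egTARGETS is assumed to map Entrez Gene ID -> family,
  # so it is reversed here to go from family -> Entrez Gene IDs
  hits <- mget(family, revmap(targetscan.Hs.egTARGETS), ifnotfound = NA)
  ids <- unique(unlist(hits, use.names = FALSE))
  ids[!is.na(ids)]
}

# Usage, once the package is installed (the miRNA name is only an example):
# head(targets_for_mirna("hsa-miR-1"))
```

The lookup goes through the miRNA family because, as far as I can tell, the
target predictions are stored per family rather than per individual miRNA.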
 


----------------------------------------
> Date: Mon, 14 Mar 2011 23:15:45 +0100
> From: james.reid at ifom-ieo-campus.it
> To: ruppert7 at hotmail.com
> CC: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] How do I parse HTML table using RCurl?
>
> Hi Ruppert,
>
> the TargetScan database for Human and Mouse is already available in
> Bioconductor as an AnnotationDbi annotation resource
> (targetscan.Hs.eg.db and targetscan.Mm.eg.db); so is miRBase, but without
> any target predictions. As others have pointed out on the mailing list, I
> would not recommend parsing the HTML of a query, as the format is likely
> to change over time; instead, download the database and re-format it.
> If you are interested in providing other miRNA target prediction
> resources to the community, I would be willing to help.
>
> Best,
> J.
>
>
> On 03/14/2011 09:18 PM, Ruppert Valentino wrote:
> >
> >
> > Hello,
> >
> > I am trying to write a script that takes a miRNA and returns the predicted target genes for that miRNA. I am trying several tools for this, one of which is TargetScan. The problem is that I don't know how to parse the HTML output table so that I get only the target genes.
> >
> > For example, I am searching for target genes of the miRNA mmu-miR-1 as follows:
> >
> > http://www.targetscan.org/cgi-bin/targetscan/vert_50/targetscan.cgi?species=Human&gid=&mir_sc=&mir_c=&mir_nc=&mirg=mmu-miR-1
> >
> > This generates a table
> >
> >
> >
> > The script is:
> >
> > URL <- "http://www.targetscan.org/cgi-bin/targetscan/vert_50/targetscan.cgi?species=Human&gid=&mir_sc=&mir_c=&mir_nc=&mirg=mmu-miR-1"
> > dat <- readLines(URL)
> >
> >
> > But I don't know how to parse the table into columns so that I can take the column entitled "Human ortholog of target gene", which holds the target genes.
> >
> >
> > In the example above the first gene COL4A3 starts at HTML code:
> >
> > COL4A3
> >
> >
> >
> > Is there any way to format such a table into columns, then extract the column entitled "Human ortholog of target gene" and assign it to a variable?
> >
> >
> > Many thanks,
> >
> >
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> > 		 	   		  
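
For the original question, a minimal sketch of pulling one column out of an
HTML table with the XML package (htmlParse and readHTMLTable). The real
TargetScan page layout is not shown in this thread, so the toy table below,
its second column, and the GENE_B entry are made-up placeholders; only the
column title and COL4A3 come from the post. As noted in the thread, scraping
is fragile because the page format may change.

```r
library(XML)

# Toy stand-in for the result page (structure is an assumption).
html <- paste0(
  "<html><body><table>",
  "<tr><th>Human ortholog of target gene</th><th>Gene name</th></tr>",
  "<tr><td>COL4A3</td><td>placeholder description A</td></tr>",
  "<tr><td>GENE_B</td><td>placeholder description B</td></tr>",
  "</table></body></html>"
)
# For the live page, the text could be fetched first, for example:
# html <- RCurl::getURL(URL)

# Parse the HTML text and read the first table into a data frame,
# using the first row as column headers.
doc <- htmlParse(html, asText = TRUE)
tbl <- readHTMLTable(doc, header = TRUE, which = 1, stringsAsFactors = FALSE)

# Find the column by its title; "." in the pattern matches a space or a dot,
# so this works whether or not the column names were sanitised.
col <- grep("Human.ortholog.of.target.gene", names(tbl))
targets <- as.character(tbl[[col[1]]])
print(targets)
```

On a real page with several tables, the value of which would need to be
adjusted to pick the right one.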

