[R] Remove superscripts from HTML objects

S Ellison S.Ellison at LGCGroup.com
Fri Apr 13 14:42:23 CEST 2012


> h <- "<html><p>Cat<sup>a</sup></p><p>Dog</p></html>"
> sub("<sup.*sup>","",h)

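Probably safer to do

gsub("<sup.*?sup>","",h)

so that, when the string contains several superscripts, each one is removed on its own rather than everything from the first <sup to the last sup>.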

e.g.
h2 <- "<html><p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>a</sup></p><p>Raccoon</p></html>"
sub("<sup.*sup>","",h2)      # greedy: drops everything between the first <sup and the last sup>
gsub("<sup.*?sup>","",h2)    # non-greedy: drops each <sup>xxx</sup> separately
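
For comparison, here is a small sketch of what the two calls should return on h2. The variable names greedy and lazy are just illustrative, and the expected strings in the comments were worked out by hand from the patterns above, so treat them as a check rather than captured output.

```r
h2 <- "<html><p>Cat<sup>a</sup></p><p>Dog</p><p>Mouse<sup>a</sup></p><p>Raccoon</p></html>"

# Greedy .* runs from the first "<sup" to the last "sup>",
# so the Dog and Mouse paragraphs are lost as well.
greedy <- sub("<sup.*sup>", "", h2)
# "<html><p>Cat</p><p>Raccoon</p></html>"

# Non-greedy .*? stops at the nearest "sup>", and gsub() repeats the
# replacement for every match, so only the superscripts are removed.
lazy <- gsub("<sup.*?sup>", "", h2)
# "<html><p>Cat</p><p>Dog</p><p>Mouse</p><p>Raccoon</p></html>"

print(greedy)
print(lazy)
```

Note that sub() with the non-greedy pattern would remove only the first superscript; it is the combination of gsub() and .*? that handles every one.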

