[R] Substring function?

Ralf B ralf.bierig at gmail.com
Tue Jul 13 14:22:17 CEST 2010


Hi all,

I would like to detect all strings in 'content' that contain any of
the strings in 'search'. Here is a code example:

content <- data.frame(urls=c(
  "http://www.google.com/search?source=ig&hl=en&rlz=&=&q=stuff&aq=f&aqi=g10&aql=&oq=&gs_rfai=CrrIS3",
  "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stuff&toggle=1")
)
search <- data.frame(signatures=c("http://www.google.com/search"))
subset(content, search$signatures %in% content$urls)

Instead of the matching row, I get an empty result:

[1] urls
<0 rows> (or 0-length row.names)


What I would like to get back is
"http://www.google.com/search?source=ig&hl=en&rlz=&=&q=stuff&aq=f&aqi=g10&aql=&oq=&gs_rfai=CrrIS3".
Is that possible? In practice I would like to run this over 1000s of
strings in 'content' and 100s of strings in 'search'. Could I run into
performance issues with this approach and, if so, are there better
ways?

Best,
Ralf
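
For reference, the empty result above is expected: %in% tests whether whole elements are equal, not whether one string occurs inside another. Below is a minimal sketch of one possible way to get the desired row, assuming that "contain" means a literal substring match (hence fixed = TRUE, so characters such as "." and "?" are not treated as regular-expression syntax). The names 'matched' and 'result' are made up for illustration, and stringsAsFactors = FALSE is an added assumption so the columns are plain character vectors.

```r
content <- data.frame(urls = c(
  "http://www.google.com/search?source=ig&hl=en&rlz=&=&q=stuff&aq=f&aqi=g10&aql=&oq=&gs_rfai=CrrIS3",
  "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stuff&toggle=1"),
  stringsAsFactors = FALSE)
search <- data.frame(signatures = c("http://www.google.com/search"),
                     stringsAsFactors = FALSE)

# Start with "no URL matched" and switch a URL to TRUE as soon as
# any signature occurs in it as a literal substring.
matched <- rep(FALSE, nrow(content))
for (sig in as.character(search$signatures)) {
  # grepl is vectorised over the URLs; fixed = TRUE disables regex syntax.
  matched <- matched | grepl(sig, content$urls, fixed = TRUE)
}

# Keep only the rows whose URL contained at least one signature.
result <- content[matched, , drop = FALSE]
print(result$urls)
```

With 100s of signatures this loop makes one grepl call per signature, each scanning all URLs, so 1000s of URLs should be a modest amount of work; that is an expectation rather than a measurement, so it is worth timing on the real data with system.time().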


