[R] StartsWith over vector of Strings?

Greg Snow Greg.Snow at imail.org
Tue Jul 13 20:04:20 CEST 2010


My solution was based on using vectors (which were your original example); now you are using data frames.  The actual result is NA; you then just print content again (which my code never modified), so you are going to see the full content data frame.
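
To illustrate the vector versus data frame distinction, here is a minimal sketch using made-up objects (v and df are hypothetical names, not from your code, and stringsAsFactors = FALSE is set only so the column is plain character in any R version). A single index on a data frame picks columns, not elements:

```r
v  <- c("abc", "def")                                  # a character vector
df <- data.frame(urls = v, stringsAsFactors = FALSE)   # hypothetical one-column data frame

v[1]                    # first element of the vector
df[1]                   # first COLUMN, so still every row of the data frame
df[1, , drop = FALSE]   # first ROW, kept as a data frame
df$urls[1]              # first element of the urls column
```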

Try:

content[na.omit(pmatch(searchset$signatures, content$urls)),]

then look at each piece, working from the inside out, to see what is happening at each step.
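
As a rough sketch of that inside-out walk-through (the URLs are shortened stand-ins for yours, and the intermediate names sig, urls, idx, keep and result are only for illustration):

```r
# Assumption: stringsAsFactors = FALSE so the columns are plain character
# vectors regardless of R version; the URLs are shortened stand-ins.
content <- data.frame(
  urls = c("http://www.google.com/search?source=ig&q=stuff",
           "http://search.yahoo.com/search?p=stuff"),
  stringsAsFactors = FALSE
)
searchset <- data.frame(signatures = "http://www.google.com/search",
                        stringsAsFactors = FALSE)

# Step 1: the innermost pieces are ordinary vectors pulled out with $
sig  <- searchset$signatures
urls <- content$urls

# Step 2: pmatch() gives, for each signature, the index of the URL it
# partially matches (NA if there is no match or the match is ambiguous)
idx <- pmatch(sig, urls)

# Step 3: na.omit() drops the entries for signatures that matched nothing
keep <- na.omit(idx)

# Step 4: the trailing comma selects rows; drop = FALSE keeps a data frame
result <- content[keep, , drop = FALSE]
print(result)
```

The drop = FALSE is added in this sketch only; without it, subsetting a one-column data frame by row collapses the result to a plain vector.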


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: Ralf B [mailto:ralf.bierig at gmail.com]
> Sent: Tuesday, July 13, 2010 11:57 AM
> To: Greg Snow
> Cc: r-help at r-project.org
> Subject: Re: [R] StartsWith over vector of Strings?
> 
> When running the combined code with your suggested line:
> 
> content <- data.frame(urls=c(
> 	"http://www.google.com/search?source=ig&hl=en&rlz=&=&q=stuff&aq=f&aqi=g10&aql=&oq=&gs_rfai=CrrIS3VU8TJqcMJHuzASm9qyBBgAAAKoEBU_QsmVh",
> 	"http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stuff&toggle=1&cop=mss&ei=UTF-8&fr=yfp-t-701")
> )
> searchset <- data.frame(signatures=c("http://www.google.com/search"))
> content[na.omit(pmatch(searchset, content$urls))]
> print(content)
> 
> I am getting both URLs as results, but in fact, would expect only the
> first URL. Am I overlooking something?
> 
> 
> Ralf
> 
> On Tue, Jul 13, 2010 at 12:03 PM, Greg Snow <Greg.Snow at imail.org> wrote:
> > content[na.omit(pmatch(searchset, content,,TRUE))]
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.snow at imail.org
> > 801.408.8111
> >
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ralf B
> >> Sent: Tuesday, July 13, 2010 5:47 AM
> >> To: r-help at r-project.org
> >> Subject: [R] StartsWith over vector of Strings?
> >>
> >> Given vectors of strings of arbitrary length
> >>
> >> content <- c("abc", "def")
> >> searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")
> >>
> >> Is it possible to determine the content String set that matches the
> >> searchset in the sense of 'startswith'? This would be a vector of all
> >> strings in content that start with the string of any of the strings in
> >> the searchset. In the little example here, this would be:
> >>
> >> result <- c("abc", "abc", "def", "def")
> >>
> >> Best,
> >> Ralf
> >>
> >


