[R] using grepl in dplyr

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Wed Nov 30 01:56:12 CET 2016


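That is not a very selective regex: it matches anything that starts with F or N, or that contains two consecutive characters drawn from A and D anywhere in the string.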
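
A quick check on a few identifiers shows the over-matching. This is only a sketch: the sample values in `ids` are invented for illustration and do not come from the original data.

```r
# Hypothetical identifiers, made up for this example
ids <- c("FN1234", "AD0123", "NX9999", "F", "XXDA00", "AL0001", "070001")

# The pattern from the question: "starts with F or N"
# OR "contains two characters in a row that are each A or D"
loose <- grepl("^[FN]|[AD]{2}", ids)

# Besides FN1234 and AD0123, this also picks up NX9999, F and XXDA00
ids[loose]
```

The character class `[FN]` matches a single F or a single N, not the two-letter prefix "FN", which is why unrelated strings get through.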

Actually, a long "or" is probably best, but you don't have to type it in by hand; you can build it from a vector of prefixes.

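# Prefixes of interest; extend this vector to cover all 20
prefixes <- c( "AD", "FN" )
# Build "^(AD|FN)[0-9]{4}$": one of the prefixes followed by exactly four digits
pat <- paste0( "^(", paste( prefixes, collapse="|" ), ")[0-9]{4}$" )
# Logical vector, TRUE where Identifier matches
grepl( pat, Identifier )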
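
To carry this through to dplyr, the logical vector from grepl can go straight into filter(). The following is a minimal sketch, assuming the dplyr package is installed; the data frame `df` and its values are hypothetical, and only the column name Identifier is taken from the question.

```r
library(dplyr)

# Hypothetical data; values invented for illustration
df <- data.frame(
  Identifier = c("AD0123", "FN9999", "AL0001", "070001", "FN12345", "XAD0123"),
  stringsAsFactors = FALSE
)

prefixes <- c("AD", "FN")
pat <- paste0("^(", paste(prefixes, collapse = "|"), ")[0-9]{4}$")

# Keep only the rows whose Identifier matches one of the prefixes
kept <- df %>% filter(grepl(pat, Identifier))

# Or drop those rows by negating the logical vector
dropped <- df %>% filter(!grepl(pat, Identifier))
```

With these sample values, `kept` holds AD0123 and FN9999; FN12345 and XAD0123 are rejected because the pattern is anchored at both ends and requires exactly four digits.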


On November 29, 2016 10:37:29 AM PST, Glenn Schultz <glennmschultz at me.com> wrote:
>Hello All,
>
>I have a dataframe of about 1.5 million rows.  From this dataframe I need
>to filter out identifiers.  Examples would be 070000-07099,
>AD0000-AD0999, AL0000-AL9999, and FN0000-FN9999.  I am using grepl to
>identify those of interest as follows:
>
> grepl("^[FN]|[AD]{2}", Identifier)
>
>The above seems to work in the case of FN and AD.  However, there are
>20 such identifiers and there must be a better way to do this than a
>long "or" statement.  Ultimately, I would like to filter these out
>using dplyr; I think the first step is to create a vector of
>TRUE/FALSE and then filter on TRUE.
>
>Any ideas are appreciated,
>Glenn
