[R] how to extract strings in any column and in any row that start with

Ana Marija <sokovic.anamarija using gmail.com>
Sat May 16 00:28:32 CEST 2020


Hi Rui,

Thank you so much, that is exactly what I needed!

Cheers,
Ana

On Fri, May 15, 2020 at 5:12 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
> Hello,
>
> I have tried several options, and with large data frames this one was
> the fastest of those I tested.
>
>
> s1 <- sapply(tot, function(x) grep('^E10', x, value = TRUE))
>
>
> Then call unlist(s1) to flatten the result into a single vector.
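
A minimal sketch of this approach on a toy data frame. The name `tot` and the pattern come from the thread; the column names and values are invented for illustration. It uses lapply() in place of sapply() as a deliberate swap: sapply() simplifies to a matrix when every column happens to yield the same number of matches, and unlist() leaves a matrix unchanged.

```r
# Toy stand-in for `tot`; column names and values are invented for illustration.
tot <- data.frame(
  code1 = c("E101", "I10", "E109"),
  code2 = c("K21", "E102", "E101"),
  stringsAsFactors = FALSE
)

# For each column, keep the values that start with "E10".
# lapply() always returns a list, one element per column.
s1 <- lapply(tot, function(x) grep("^E10", x, value = TRUE))

# Flatten the per-column results into one unnamed character vector.
hits <- unlist(s1, use.names = FALSE)

# Keep each matching string once.
codes <- unique(hits)
print(codes)
```

The unique() step is an assumption about the desired result, based on the example output in the original question, which lists each code once.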
> A close second (15% slower) was
>
>
> s2 <- tot[sapply(tot, function(x) grepl('^E10', x))]
>
>
> grep/unlist was 3.7 times slower:
>
>
> grep("^E10", unlist(tot), value = TRUE)
>
>
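
To check that the three variants agree, and to time them on your own machine, something like the following could be used. The data here are randomly generated and far smaller than the 502536 x 1093 data frame in the thread, so the timings should not be expected to reproduce the ratios quoted above. As a deliberate swap, the first variant uses lapply() so that the result is always a list.

```r
# Randomly generated stand-in for `tot`; size and contents are arbitrary assumptions.
set.seed(1)
vals <- c("E101", "E102", "I10", "K21", "J45")
tot <- as.data.frame(
  matrix(sample(vals, 2000 * 50, replace = TRUE), ncol = 50),
  stringsAsFactors = FALSE
)

# Variant 1: grep() per column, then flatten.
print(system.time(
  r1 <- unlist(lapply(tot, function(x) grep("^E10", x, value = TRUE)),
               use.names = FALSE)
))

# Variant 2: logical matrix from grepl(), used to index the data frame.
print(system.time(
  r2 <- tot[sapply(tot, function(x) grepl("^E10", x))]
))

# Variant 3: flatten first, then a single grep() call.
print(system.time(
  r3 <- grep("^E10", unlist(tot), value = TRUE)
))

# All three walk the data column by column, so the values come out in the same order.
# Variant 3 keeps the names produced by unlist(), hence unname().
stopifnot(identical(r1, r2), identical(r1, unname(r3)))
```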
> Hope this helps,
>
> Rui Barradas
>
> At 20:24 on 15/05/20, Ana Marija wrote:
> > Hello,
> >
> > this command ran for more than 2 hours
> > grep("E10", tot, value = TRUE)
> > and produced no output
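
A likely explanation for the long run time, sketched on a hypothetical two-column data frame: grep() coerces a non-character x with as.character(), and for a data frame that turns each whole column into one deparsed string. The search then runs over one long string per column rather than over individual values.

```r
# Hypothetical tiny data frame; names and values are invented for illustration.
tot <- data.frame(a = c("E101", "x"), b = c("y", "z"), stringsAsFactors = FALSE)

# as.character() on a data frame yields one deparsed string per column.
as_text <- as.character(tot)
print(as_text)

# grep() therefore returns whole deparsed columns, not individual values.
res <- grep("E10", tot, value = TRUE)
print(res)
```

On a data frame with hundreds of thousands of rows, each of those per-column strings is very large, which is consistent with the behaviour reported here.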
> >
> > and this command
> > df1 <- tot %>% filter_all(any_vars(grepl( '^E10', .)))
> >
> > gave me a subset (a data frame) of tot with the rows where some value matches ^E10
> >
> > What I need is just a vector of all values in tot which start with E10.
> >
> > Thanks
> > Ana
> >
> > On Fri, May 15, 2020 at 12:13 PM Jeff Newmiller
> > <jdnewmil using dcn.davis.ca.us> wrote:
> >>
> >> Read about regular expressions... they are extremely useful.
> >>
> >> df1 <- tot %>% filter_all(any_vars(grepl( '^E10', .)))
> >>
> >> It is bad form not to put spaces around the <- assignment.
> >>
> >>
> >> On May 15, 2020 10:00:04 AM PDT, Ana Marija <sokovic.anamarija using gmail.com> wrote:
> >>> Hello,
> >>>
> >>> I have a data frame:
> >>>
> >>>> dim(tot)
> >>> [1] 502536   1093
> >>>
> >>> How would I extract from it all strings that start with E10?
> >>>
> >>> I know how to extract all rows that contain E10
> >>> df0<-tot %>% filter_all(any_vars(. %in% c('E10')))
> >>>> dim(df0)
> >>> [1] 5105 1093
> >>>
> >>> but I just need a vector of strings that start with E10...
> >>> it would look something like this:
> >>>
> >>> [1] "E102" "E109" "E108" "E103" "E104" "E105" "E101" "E106" "E107"
> >>>
> >>> Thanks
> >>> Ana
> >>>
> >>> ______________________________________________
> >>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >> --
> >> Sent from my phone. Please excuse my brevity.
> >
