[R] R grep question

Bert Gunter bgunter.4567 at gmail.com
Fri May 28 17:16:45 CEST 2021


FWIW:

I think Jim makes an excellent point -- regexes really aren't the right
tool for this sort of thing (imho); matching is.
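
A small sketch of why, under one assumption: the entry "MLH1P1" is
invented for illustration and is not from Kai's data. grepl() reports a
hit wherever the pattern occurs inside a string, while %in% only matches
whole values:

```r
match_strings <- c("MLH1", "MSH2")
# Hypothetical gene names; "MLH1P1" is made up to show a partial match
genes <- c("MLH1", "MLH1P1", "MSH2", "MCC3")

# Regex: TRUE wherever the pattern occurs anywhere in the string
by_regex <- grepl("MLH1|MSH2", genes)

# Matching: TRUE only where the whole string equals one of match_strings
by_match <- genes %in% match_strings

print(by_regex)
print(by_match)
```

Here the regex flags "MLH1P1" as well, whereas %in% does not, which may
or may not be what is wanted depending on the data.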

Note also that if one is willing to live with a logical response (better,
again imho), then the ifelse() can of course be dispensed with:

> CRC$MMR.gene<-CRC$gene.all %in% match_strings
> CRC$MMR.gene
[1]  TRUE FALSE  TRUE FALSE
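
As a rough sketch of why the logical form can be handier (reusing Jim's
toy data frame quoted below; the names n_mmr and mmr_rows are just mine
for the example), the column can be counted and used for subsetting
directly:

```r
match_strings <- c("MLH1", "MSH2")
CRC <- data.frame(gene.all = c("MLH1", "MSL1", "MSH2", "MCC3"))
CRC$MMR.gene <- CRC$gene.all %in% match_strings

# TRUE counts as 1, so sum() gives the number of matching rows
n_mmr <- sum(CRC$MMR.gene)

# A logical vector can index rows directly, with no == "Yes" comparison
mmr_rows <- CRC[CRC$MMR.gene, ]

print(n_mmr)
print(mmr_rows)
```

With a "Yes"/"No" column, each of these steps would need an extra
comparison such as CRC$MMR.gene == "Yes".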

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, May 27, 2021 at 8:35 PM Jim Lemon <drjimlemon using gmail.com> wrote:

> Hi Kai,
> You may find %in% easier than grep when multiple matches are needed:
>
> match_strings<-c("MLH1","MSH2")
> CRC<-data.frame(gene.all=c("MLH1","MSL1","MSH2","MCC3"))
> CRC$MMR.gene<-ifelse(CRC$gene.all %in% match_strings,"Yes","No")
>
> Composing your match strings before applying %in% may be more flexible
> if you have more than one selection to make.
>
> On Fri, May 28, 2021 at 1:57 AM Marc Schwartz via R-help
> <r-help using r-project.org> wrote:
> >
> > Hi,
> >
> > A quick clarification:
> >
> > The regular expression is a single quoted character vector, not a
> > character vector on either side of the | operator:
> >
> > "MLH1|MSH2"
> >
> > not:
> >
> > "MLH1"|"MSH2"
> >
> > The | is treated as a special character within the regular expression.
> > See ?regex.
> >
> > grep(), when value = FALSE, returns the index of the match within the
> > source vector, while when value = TRUE, returns the found character
> > entries themselves.
> >
> > Thus, you need to be sure that your ifelse() incantation is matching the
> > correct values.
> >
> > In the case of grepl(), it returns TRUE or FALSE, as Rui noted, thus:
> >
> >    CRC$MMR.gene <- ifelse(grepl("MLH1|MSH2",CRC$gene.all), "Yes", "No")
> >
> > should work.
> >
> > Regards,
> >
> > Marc Schwartz
> >
> >
> > Kai Yang via R-help wrote on 5/27/21 11:23 AM:
> > > > Hi Rui, thank you for your suggestion.
> > > > But when I tried the solution, I got the message below:
> > > >
> > > > Error in "MLH1" | "MSH2" : operations are possible only for numeric, logical or complex types
> > > >
> > > > Does it mean grepl cannot work on a character field?
> > > > Thanks,
> > > > Kai
> > > >
> > > > On Thursday, May 27, 2021, 01:37:58 AM PDT, Rui Barradas <ruipbarradas using sapo.pt> wrote:
> > >
> > >   Hello,
> > >
> > > ifelse needs a logical condition, not the value. Try grepl.
> > >
> > >
> > > CRC$MMR.gene <- ifelse(grepl("MLH1"|"MSH2",CRC$gene.all), "Yes", "No")
> > >
> > >
> > > Hope this helps,
> > >
> > > Rui Barradas
> > >
> > > > At 05:29 on 27/05/21, Kai Yang via R-help wrote:
> > >> Hi List,
> > > >> I wrote the code to create a new variable:
> > > >>
> > > >> CRC$MMR.gene<-ifelse(grep("MLH1"|"MSH2",CRC$gene.all,value=T),"Yes","No")
> > > >>
> > > >>
> > > >> I need to create an MMR.gene column in the CRC data frame: if the gene.all column contains MLH1 or MSH2, then MMR.gene=Yes; if not, MMR.gene=No.
> > > >>
> > > >> But the code doesn't work for me. Can anyone tell me how to fix the code?
> > >>
> > >> Thank you,
> > >>
> > >> Kai
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
