[R] help with text patterns in strings

William Dunlap wdunlap at tibco.com
Mon Jun 17 21:51:24 CEST 2013


> And is there a way to simultaneously tell R that, for example, “Friday” is
> the same as “Fri” or “F”; “Saturday” is the same as “Sat” or “Sa”; etc.?

Look at pmatch (partial match):
  > dayNames <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
  > dayNames[pmatch(c("F", "Fr", "Friday", "Tu", "Th", "T", "Lunes"), dayNames, duplicates.ok=TRUE)]
  [1] "Friday"   "Friday"   "Friday"   "Tuesday"  "Thursday" NA         NA
"T" could be either "Tuesday" or "Thursday" so it is mapped to NA, as are entries like "Lunes" that do not match at all.
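
To cover the first part of the question as well (counting each day in a comma-separated answer separately), the pmatch call can be combined with strsplit and table. This is only a minimal sketch: the vector name 'answers' and the other variable names are made up for illustration, and it assumes the answers are separated by commas with optional spaces. Note that pmatch is case-sensitive, so "fri" would map to NA.

```r
# Hypothetical input: the six answers from the original question.
answers <- c("Friday", "Wednesday", "Friday, Saturday, Sunday",
             "Saturday", "Sat, Sun", "Th, F, Sa")
dayNames <- c("Sunday", "Monday", "Tuesday", "Wednesday",
              "Thursday", "Friday", "Saturday")

# Split each answer on a comma followed by optional spaces,
# then flatten the result into one character vector.
parts <- unlist(strsplit(answers, ", *"))

# Map each entry to a full day name; ambiguous or unmatched
# entries (e.g. "T" or "Lunes") would become NA.
fullDay <- dayNames[pmatch(parts, dayNames, duplicates.ok = TRUE)]

# Tabulate, keeping days that were never mentioned as zero counts.
dayCounts <- table(factor(fullDay, levels = dayNames))
print(dayCounts)
```

Using factor with explicit levels keeps the days in calendar order and shows the zero counts; any NA from pmatch is dropped by table unless useNA="ifany" is given.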

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of arun
> Sent: Monday, June 17, 2013 12:15 PM
> To: bcrombie
> Cc: R help 
> Subject: Re: [R] help with text patterns in strings
> 
> Hi,
> Maybe this helps:
> 
> dat1<-
> data.frame(Ans=c("Friday","Wednesday","Friday,Saturday,Sunday","Saturday","Sat,Sun",
> "Th,F,Sa"),stringsAsFactors=FALSE)
>  dat1
>                      Ans
> 1                 Friday
> 2              Wednesday
> 3 Friday,Saturday,Sunday
> 4               Saturday
> 5                Sat,Sun
> 6                Th,F,Sa
> 
> 
>  vec1<- c("Su","M","Tu","W","Th","F","Sa")
>  vec2<-unlist(strsplit(dat1$Ans,","))
> 
> vec2
> 
>  #[1] "Friday"    "Wednesday" "Friday"    "Saturday"  "Sunday"    "Saturday"
>  #[7] "Sat"       "Sun"       "Th"        "F"         "Sa"
> sapply(vec1,function(x) length(vec2[grep(x,vec2)]) )
> #Su  M Tu  W Th  F Sa
> # 2  0  0  1  1  3  4
> 
> A.K.
> 
> 
> ----- Original Message -----
> From: bcrombie <bcrombie at utk.edu>
> To: r-help at r-project.org
> Cc:
> Sent: Monday, June 17, 2013 1:59 PM
> Subject: [R] help with text patterns in strings
> 
> Let’s say I have a data set that includes a column of answers to a question
> “What days of the week are you most likely to eat steak?”.
> The answers provided are [1] “Friday”, [2] “Wednesday”, [3] “Friday,
> Saturday, Sunday”, [4] “Saturday”, [5] “Sat, Sun”, [6] “Th, F, Sa”
> How can I tell R to count “Friday, Saturday, Sunday”, “Sat, Sun”, and “Th,
> F, Sa” as three separate entries for each unique observation?
> And is there a way to simultaneously tell R that, for example, “Friday” is
> the same as “Fri” or “F”; “Saturday” is the same as “Sat” or “Sa”; etc.?
> Thanks for your assistance.
> 
> 
> 
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/help-with-text-patterns-in-strings-tp4669714.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

