[R] regex - extracting 2 numbers and " from strings

David Winsemius dwinsemius at comcast.net
Fri Oct 9 03:07:37 CEST 2015


On Oct 8, 2015, at 4:50 PM, Omar André Gonzáles Díaz wrote:

> David, it does work but not in all cases:

It should work if you change the "+" to "*" in the last capture group. That makes any trailing characters after the inch mark entirely optional.

> sub("(^.+ )(\\d+)([\"]|[']{2})(.*$)", "\\2\\3", b)
 [1] "40''" "40''" "49\"" "49\"" "28\"" "40\"" "32''" "32''" "40\"" "55\""
[11] "40\"" "24\"" "42''" "50\"" "48\"" "48\"" "48\"" "48''" "50\"" "50''"
[21] "50\"" "55\"" "55''" "55\"" "55''" "55\"" "65''" "65\"" "65''" "75\""
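
A minimal sketch of why that one-character change matters, using a hypothetical three-element sample rather than the full vector. The names `x`, `pat_plus` and `pat_star` are made up for illustration. Note that an element with no leading text before the number matches neither pattern, so `sub()` hands it back unchanged, which is why the bare sizes already in `b` come through intact.

```r
# Hypothetical sample covering the three shapes seen in this thread:
# a bare size, a size at the very end, and a size followed by more text.
x <- c("40''",
       "HAIER TELEVISOR LED LE28F6600 28\"",
       "SMART TV LCD FHD 70\" LC70LE660")

# Group 1: everything up to a space; group 2: the digits;
# group 3: one double quote or two single quotes; group 4: the rest.
pat_plus <- "(^.+ )(\\d+)([\"]|[']{2})(.+$)"  # rest must be non-empty
pat_star <- "(^.+ )(\\d+)([\"]|[']{2})(.*$)"  # rest may be empty

# With "+", the second element has nothing after the inch mark,
# so the pattern fails and the whole string is returned.
res_plus <- sub(pat_plus, "\\2\\3", x)

# With "*", the second element is reduced to its size as well.
res_star <- sub(pat_star, "\\2\\3", x)

print(res_plus)
print(res_star)
```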


Moral of the story: Always post an example with the necessary complexity.
> 
> This is now my b vector, after your solution:
> 
> b <- c("40''", "40''", "49\"", "49\"", "HAIER TELEVISOR LED LE28F6600 28\"", 
> "40\"", "32''", "32''", "40\"", "55\"", "HAIER TV LED LE40B8000 FULL HD 40\"", 
> "24\"", "42''", "HAIER TELEVISOR LED LE50K5000N 50\"", "48\"", 
> "48\"", "48\"", "48''", "50\"", "50''", "50\"", "55\"", "55''", 
> "55\"", "55''", "55\"", "65''", "SAMSUNG SMART TV 65JU6500 LED UHD 65\"", 
> "65''", "75\"")
> 
> 2015-10-08 18:14 GMT-05:00 David Winsemius <dwinsemius at comcast.net>:
> 
> On Oct 8, 2015, at 3:45 PM, Omar André Gonzáles Díaz wrote:
> 
> > Hi, I have a vector of 100 elements like these:
> >
> > a <- c("SMART TV LCD FHD 70\" LC70LE660", "LED FULL HD 58'' LE58D3140")
> >
> > I want to put just the (70\") and (58'') in a vector b.
> 
> > sub("(^.+ )(\\d+)([\"]|[']{2})(.+$)", "\\2\\3", a)
> [1] "70\"" "58''"
> 
> Also, the `stringr` package uses the code in the `stringi` package to give more compact expressions. You might want to look at:
> 
> str_extract     Extract matching patterns from a string.
> str_extract_all Extract matching patterns from a string.
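
A rough sketch of the extraction approach, as opposed to the substitution approach with `sub()`. The pattern `[0-9]+("|'')` and the names `x`, `pat`, `base_res` and `stringr_res` are assumptions made for illustration, not something from the original post. The base R pair `regexpr()`/`regmatches()` is shown first so the example runs without extra packages; the `stringr` call is only attempted if that package happens to be installed.

```r
# Hypothetical sample: the two strings from the question plus a bare size.
x <- c("SMART TV LCD FHD 70\" LC70LE660",
       "LED FULL HD 58'' LE58D3140",
       "40''")

# Digits immediately followed by one double quote or two single quotes.
pat <- "[0-9]+(\"|'')"

# Base R: regexpr() locates the first match in each string and
# regmatches() extracts it. Strings with no match are dropped.
base_res <- regmatches(x, regexpr(pat, x))
print(base_res)

# stringr, if available: str_extract() returns the first match per
# string and gives NA (rather than dropping) when there is none.
if (requireNamespace("stringr", quietly = TRUE)) {
  stringr_res <- stringr::str_extract(x, pat)
  print(stringr_res)
}
```

One difference worth keeping in mind: because `regmatches()` drops non-matching elements, its result can be shorter than the input, whereas `str_extract()` keeps positions aligned with the input.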
> 
> 
> >
> > This is my attempt, but it is not working:
> >
> > b <- grepl('^[0-9]{2}""$',a)
> >
> > Any hint is welcome, thanks.
> >
> 
> David Winsemius
> Alameda, CA, USA
> 
> 

David Winsemius
Alameda, CA, USA


