[R] Pull Stock Symbol Out of String

Boris Steipe boris.steipe at utoronto.ca
Tue Apr 8 23:45:42 CEST 2014


You could try:

# Use ?regexec and ?regmatches to return a list of grouped matches.
# Use \\(  and \\) to match literal parentheses.
# Use ... to match exactly three characters (this assumes a three-character ticker).
# Use $ to match at end of string.

s1 <- "American Tower Corporation (REIT) (AMT)"
s2 <- "Aetna Inc. (AET)"
getSym <- function(s) {regmatches(s, regexec("\\((...)\\)$", s))[[1]][2]}

getSym(s1) # [1] "AMT"
getSym(s2) # [1] "AET"
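
If the tickers are not always exactly three characters long, the same idea can be loosened. The following is only a sketch under that assumption: the helper name getLastParen and the "Example Holdings" string are made up for illustration. It captures whatever sits inside the final pair of parentheses, tolerates trailing whitespace, works on a whole character vector, and returns NA where nothing matches.

```r
# Hypothetical helper (not part of the original reply): extract the contents
# of the last parenthesised group at the end of each string.
getLastParen <- function(s) {
  # [^()]+ matches one or more characters that are not parentheses, so the
  # capture cannot span two groups; [[:space:]]*$ allows trailing whitespace.
  m <- regmatches(s, regexec("\\(([^()]+)\\)[[:space:]]*$", s))
  # Each list element holds the full match followed by the captured group,
  # or is empty when the pattern did not match.
  vapply(m, function(x) if (length(x) >= 2) x[2] else NA_character_,
         character(1))
}

x <- c("American Tower Corporation (REIT) (AMT)",
       "Aetna Inc. (AET)",
       "Example Holdings (EXMPL)",   # made-up string with a longer ticker
       "No ticker here")
getLastParen(x) # [1] "AMT"   "AET"   "EXMPL" NA
```

Because the pattern is anchored at the end of the string, an earlier group such as (REIT) is skipped without any special handling.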

Cheers,
B.




On 2014-04-08, at 2:29 PM, Sparks, John James wrote:

> Dear R Helpers,
> 
> My regex skills are beginner to intermediate and banging around the web
> has not resulted in a solution to the problem below so I hope that one of
> you who has mad skills can help me out.
> 
> I want to extract the stock ticker--AMT-- out of the string
> 
> American Tower Corporation (REIT) (AMT)
> 
> The presence of the other parenthetical text (REIT) makes this difficult.
> Please note that the string may or may not have an interfering set of
> characters such as the (REIT) so the solution needs to be generalizable to
> the last set of characters that are contained in parentheses in the larger
> string.  So an example of a string without the interfering (REIT) would be
> 
> Aetna Inc. (AET)
> 
> 
> Your assistance would be very much appreciated.
> 
> --John Sparks
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



