[R] Splitting columns, strings or reg exp returning substrings

Henrique Dallazuanna wwwhsd at gmail.com
Fri Sep 25 16:22:43 CEST 2009


Try this:

DF <- data.frame(A = c('11_12', '22_23', '33_34'),
                 B = sample(3))

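#1) Using strsplit: split on "_" and keep the first piece of each result
transform(DF, C = sapply(strsplit(as.character(DF$A), "_"), '[', 1))

#2) Using substr: works only when the prefix is always two characters wide
transform(DF, C = substr(DF$A, 1, 2))

#3) Using a regex: delete everything from the first "_" onwards
transform(DF, C = sub("_.*", "", DF$A))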
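
The question also asks about splitting the column into two and about returning the substring a regular expression matches. A minimal sketch of both follows, assuming every value contains exactly one "_". DF2 and its column names are made up for illustration, and regmatches() is only available in more recent versions of R.

```r
# Hypothetical example data with the same xx_yy format as DF$A above
DF2 <- data.frame(A = c("11_12", "22_23", "33_34"), stringsAsFactors = FALSE)

# Split each string on a literal "_" and bind the pieces into a two-column matrix
parts <- do.call(rbind, strsplit(DF2$A, "_", fixed = TRUE))
DF2$left  <- parts[, 1]
DF2$right <- parts[, 2]

# Return the text the pattern matched: everything before the first "_"
DF2$prefix <- regmatches(DF2$A, regexpr("^[^_]+", DF2$A))

DF2
```

Note that regmatches() with regexpr() silently drops elements that do not match, so the assignment to DF2$prefix only lines up when every row matches the pattern.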


On Fri, Sep 25, 2009 at 11:01 AM, Dry, Jonathan R
<Jonathan.Dry at astrazeneca.com> wrote:
> Currently as the first column in a data frame I have string values in the format xx_yy - I want to create a new column with just the substring xx (for each row in turn).  Three possible ways to do this might be (1) split the string by '_' using strsplit and paste the first of the resulting variables into a new column, but I have been unable to do this for each row of my data frame in turn (trying to use apply); (2) split the column into two based on '_', but I am not sure if this is possible; (3) use a regular expression to return the substring up to the '_', but I am unsure how to make a regular expression return the substring it matches to in R.
>
> Any ideas on all three counts would be gratefully received.



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



