[R] split a character variable into several character variable by a character

Darren Norris doon75 at hotmail.com
Sat Apr 11 15:17:18 CEST 2009


Just an alternative: try gsub to perform the split at "-".
Strictly speaking it is not splitting but substituting everything before or
after the "-" with nothing (an empty string).

?gsub

from Francisco's example:
 
dat <- read.table("clipboard", header=TRUE)  # read from your email
gsub("-.*", "", dat$popcode)  # gives the BCPy01 part of column popcode
gsub(".*-", "", dat$popcode)  # gives the 01 part of column popcode

then to add these vectors as columns to your dataframe:

dat$popcodeStart<-gsub("-.*","",dat$popcode)
dat$popcodeEnd<-gsub(".*-","",dat$popcode)

dat
     popcode     codetot   p3need varleft varright popcodeStart popcodeEnd
1  BCPy01-01 BCPy01-01-1 100.0000  BCPy01        1       BCPy01         01
2  BCPy01-01 BCPy01-01-2 100.0000  BCPy01        1       BCPy01         01
3  BCPy01-01 BCPy01-01-3 100.0000  BCPy01        1       BCPy01         01
4  BCPy01-02 BCPy01-02-1  92.5926  BCPy01        2       BCPy01         02
5  BCPy01-02 BCPy01-02-1 100.0000  BCPy01        2       BCPy01         02
6  BCPy01-02 BCPy01-02-2  92.5926  BCPy01        2       BCPy01         02
7  BCPy01-02 BCPy01-02-2 100.0000  BCPy01        2       BCPy01         02
8  BCPy01-02 BCPy01-02-3  92.5926  BCPy01        2       BCPy01         02
9  BCPy01-02 BCPy01-02-3 100.0000  BCPy01        2       BCPy01         02
10 BCPy01-03 BCPy01-03-1 100.0000  BCPy01        3       BCPy01         03
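
For anyone without the original email on the clipboard, here is a self-contained sketch of the same idea. The popcode vector is a stand-in typed from the output above, and I use sub rather than gsub since each pattern can only match once per string. Note that ".*-" is greedy, so with more than one "-" it strips everything up to the last one:

```r
# Stand-in for dat$popcode (values typed from the printed output above)
popcode <- c("BCPy01-01", "BCPy01-02", "BCPy01-03")

# Drop the first "-" and everything after it
popcodeStart <- sub("-.*", "", popcode)
# Drop everything up to and including the last "-" (".*" is greedy)
popcodeEnd <- sub(".*-", "", popcode)

data.frame(popcode, popcodeStart, popcodeEnd)

# With two hyphens, as in the codetot column, the two patterns
# cut at different hyphens
codetot <- "BCPy01-02-3"
sub("-.*", "", codetot)   # keeps the text before the first "-"
sub(".*-", "", codetot)   # keeps the text after the last "-"
```

So for a column like codetot the middle piece is lost with this approach; strsplit is probably the better fit there.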

or with strsplit:
dat <- read.table("clipboard", header=TRUE)  # read from your email
# creates a matrix with the result of strsplit
newdat <- do.call("rbind", strsplit(as.character(dat$popcode), "-"))
colnames(newdat) <- c("popcodeStart", "popcodeEnd")  # add column names
newd <- data.frame(dat, newdat)  # create new dataframe
newd
     popcode     codetot   p3need popcodeStart popcodeEnd
1  BCPy01-01 BCPy01-01-1 100.0000       BCPy01         01
2  BCPy01-01 BCPy01-01-2 100.0000       BCPy01         01
3  BCPy01-01 BCPy01-01-3 100.0000       BCPy01         01
4  BCPy01-02 BCPy01-02-1  92.5926       BCPy01         02
5  BCPy01-02 BCPy01-02-1 100.0000       BCPy01         02
6  BCPy01-02 BCPy01-02-2  92.5926       BCPy01         02
7  BCPy01-02 BCPy01-02-2 100.0000       BCPy01         02
8  BCPy01-02 BCPy01-02-3  92.5926       BCPy01         02
9  BCPy01-02 BCPy01-02-3 100.0000       BCPy01         02
10 BCPy01-03 BCPy01-03-1 100.0000       BCPy01         03
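
The strsplit route as a self-contained sketch, again with a typed-in stand-in for dat$popcode. The check on the number of pieces is my own addition: rbind only gives a clean matrix when every string splits into the same number of parts, otherwise the shorter rows get recycled.

```r
# Stand-in for dat$popcode (values typed from the printed output above)
popcode <- c("BCPy01-01", "BCPy01-02", "BCPy01-03")

# strsplit returns a list with one character vector per input string;
# fixed = TRUE treats "-" as a literal character rather than a regex
pieces <- strsplit(popcode, "-", fixed = TRUE)

# Guard: every string must yield exactly two pieces before binding rows
stopifnot(all(lengths(pieces) == 2))

newdat <- do.call(rbind, pieces)                     # 3 x 2 character matrix
colnames(newdat) <- c("popcodeStart", "popcodeEnd")
newdat
```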


Hope that helps,
Darren



Hello Mao,



dat<-read.table("clipboard", header=T)#Read from your email
varleft<-substr(dat$popcode,1,6)
varright<-substr(dat$popcode,8,9)
datnew<-data.frame(dat,varleft,varright)

 > datnew
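
The fixed positions above work because every popcode in the posted data has a six-character prefix. If the prefix length varied, one possible sketch (not from the original posts) is to locate the hyphen with regexpr first. The second value in the vector below is invented purely to show a different prefix length:

```r
# Stand-in values; "AB-7" is invented to illustrate a shorter prefix
popcode <- c("BCPy01-01", "AB-7", "BCPy01-03")

# Position of the first "-" in each string (-1 where there is none)
pos <- regexpr("-", popcode, fixed = TRUE)

varleft  <- substr(popcode, 1, pos - 1)               # text before the "-"
varright <- substr(popcode, pos + 1, nchar(popcode))  # text after the "-"

data.frame(popcode, varleft, varright)
```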




