[R] Splitting vector

arun smartpink111 at yahoo.com
Thu Apr 18 23:31:56 CEST 2013


Hi,
Try:
vec1<- "mue#d/sjbijk@ruepvnvbnceiicrpgxkgcyl@keduhqvqi/ubudvxopddpfddgitrynzshzdcwgneyffrkpbxwilwqngrsals#geqmtkcpkp/qecgdfa#uag"

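library(seqinr)
# Strip the separators, split the string into single characters with s2c(),
# then cut words of length j (2 to 5) starting at offset i (frame 0 to 4).
res<-lapply(0:4,function(i) lapply(2:5,function(j) splitseq(s2c(gsub("[#@/]","",vec1)),word=j,frame=i)))
#or, removing all punctuation with stringr
library(stringr)
res1<-lapply(0:4,function(i) lapply(2:5,function(j) splitseq(s2c(str_replace_all(vec1,"[[:punct:]]","")),word=j,frame=i)))
#or, removing everything that is not alphanumeric
res2<-lapply(0:4,function(i) lapply(2:5,function(j) splitseq(s2c(gsub("[^[:alnum:]]","",vec1)),word=j,frame=i)))
identical(res,res1)
#[1] TRUE
identical(res,res2)
#[1] TRUE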
res[[1]][[1]]  # frame 0 (start at the first letter), 2-letter words
# [1] "mu" "ed" "sj" "bi" "jk" "ru" "ep" "vn" "vb" "nc" "ei" "ic" "rp" "gx" "kg"
#[16] "cy" "lk" "ed" "uh" "qv" "qi" "ub" "ud" "vx" "op" "dd" "pf" "dd" "gi" "tr"
#[31] "yn" "zs" "hz" "dc" "wg" "ne" "yf" "fr" "kp" "bx" "wi" "lw" "qn" "gr" "sa"
#[46] "ls" "ge" "qm" "tk" "cp" "kp" "qe" "cg" "df" "au" "ag"
res[[1]][[2]]  # frame 0, 3-letter words
# [1] "mue" "dsj" "bij" "kru" "epv" "nvb" "nce" "iic" "rpg" "xkg" "cyl" "ked"
#[13] "uhq" "vqi" "ubu" "dvx" "opd" "dpf" "ddg" "itr" "ynz" "shz" "dcw" "gne"
#[25] "yff" "rkp" "bxw" "ilw" "qng" "rsa" "lsg" "eqm" "tkc" "pkp" "qec" "gdf"
#[37] "aua"
res[[5]][[4]]  # frame 4 (start at the fifth letter), 5-letter words
# [1] "sjbij" "kruep" "vnvbn" "ceiic" "rpgxk" "gcylk" "eduhq" "vqiub" "udvxo"
#[10] "pddpf" "ddgit" "rynzs" "hzdcw" "gneyf" "frkpb" "xwilw" "qngrs" "alsge"
#[19] "qmtkc" "pkpqe" "cgdfa"
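
For reference, the same splitting can be sketched in base R without seqinr. This is only an illustration: `split_words` is a hypothetical helper name, not a function from any package, and it assumes (as in the output above) that an incomplete trailing word is dropped. A short made-up string is used so the results are easy to check by eye.

```r
# Hypothetical helper, not part of seqinr: cut a string into non-overlapping
# words of length `word`, skipping the first `frame` characters.
split_words <- function(x, word, frame = 0) {
  clean <- gsub("[^[:alnum:]]", "", x)   # drop "#", "@", "/" and other non-alphanumerics
  n <- nchar(clean)
  if (frame + word > n) return(character(0))  # not even one complete word fits
  starts <- seq(frame + 1, n - word + 1, by = word)
  substring(clean, starts, starts + word - 1)
}

x <- "ab#cd/efg@hij"          # toy input; cleaned form is "abcdefghij"
split_words(x, 2)             # "ab" "cd" "ef" "gh" "ij"
split_words(x, 3, frame = 1)  # "bcd" "efg" "hij"

# Same nested layout as `res`: outer index = frame + 1, inner index = word - 1
res_base <- lapply(0:4, function(i) lapply(2:5, function(j) split_words(x, j, i)))
```

With this layout, `res_base[[2]][[2]]` holds the 3-letter words starting from the second letter.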

 A.K.

>I need to split a vector into 2, 3, 4 and 5 letter words. Characters like "#", "@", "/" should be excluded.
>"mue#d/sjbijk@ruepvnvbnceiicrpgxkgcyl@keduhqvqi/ubudvxopddpfddgitrynzshzdcwgneyffrkpbxwilwqngrsals#geqmtkcpkp/qecgdfa#uag"
>
>For example: 
>2 letter split: "mu", "ed",... 
>3 letter split: "mue","dsj",... 
>4 letter split: "mued",... 
>.............................. 
>Another split doing the same from the second letter 
>2 letter split: "ue", "ds",.... 
>3 letter split: "ued",...
>---------------------------- 
>5 letter split: "uedsj",... 
>--------------------------------------------- 
>Continue the same process up to the fifth letter "s".
>Thanks.