[R] divide column in a dataframe based on a character

David Winsemius dwinsemius at comcast.net
Tue Oct 26 06:33:30 CEST 2010


On Oct 25, 2010, at 8:56 PM, Daisy Englert Duursma wrote:

> Hello,
>
> If I have a dataframe:
>
> example(data.frame)
> zz <- c("aa_bb","bb_cc","cc_dd","dd_ee","ee_ff","ff_gg","gg_hh",
>         "ii_jj","jj_kk","kk_ll")
> ddd <- cbind(dd, group = zz)
>
> and I want to divide the column named group by the "_", how would I  
> do this?
>
> so instead of the first row being
> x   y  fac char  group
> 1  1   C    a     aa_bb
>
> it should be:
> x  y fac  char group_a    group_b
> 1  1   C    a      aa             bb
>
>
>
> I know for a vector I can:
> x1 <- c("a_b","b_c","c_d")
> do.call("rbind",strsplit(x1, "_"))
>
> but I am not sure how this relates to my data.frame

The group column is a factor, which is the default structure for
non-numeric (character) arguments to data.frame() and
cbind.data.frame(). If you want the split values, you must first
convert the column to character:

 > # sapply() returns a plain character vector; lapply() would store a list column
 > ddd$group_a <- sapply(strsplit(as.character(ddd$group), "_"), "[", 1)
 > ddd$group_b <- sapply(strsplit(as.character(ddd$group), "_"), "[", 2)
 > ddd
    x  y fac char group group_a group_b
1  1  1   C    a aa_bb    aa     bb
2  1  2   B    b bb_cc    bb     cc
3  1  3   C    c cc_dd    cc     dd
4  1  4   C    d dd_ee    dd     ee
5  1  5   B    e ee_ff    ee     ff
6  1  6   A    f ff_gg    ff     gg
7  1  7   C    g gg_hh    gg     hh
8  1  8   A    h ii_jj    ii     jj
9  1  9   B    i jj_kk    jj     kk
10 1 10   B    j kk_ll    kk     ll
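
The do.call("rbind", ...) idiom from the question also carries over to
the data frame once the factor is converted. The following is only a
sketch under some assumptions: the small data frame is made up for
illustration (it is not the output of example(data.frame)), the column
is forced to a factor to mirror the situation above, and every value is
assumed to contain exactly one "_".

```r
# Hypothetical stand-in for the poster's data frame
ddd <- data.frame(x = 1, y = 1:3,
                  group = c("aa_bb", "bb_cc", "cc_dd"),
                  stringsAsFactors = TRUE)  # force a factor, as in the thread

# strsplit() needs character input, so convert the factor first,
# then stack the pieces into a 3 x 2 character matrix
parts <- do.call(rbind, strsplit(as.character(ddd$group), "_", fixed = TRUE))

# Assign each matrix column to its own data frame column
ddd$group_a <- parts[, 1]
ddd$group_b <- parts[, 2]

# Optional: drop the original combined column
ddd$group <- NULL
```

This splits each string only once instead of once per new column. If
some values had a different number of "_" separators, rbind() would
recycle the shorter pieces, so that case would need separate handling.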

-- 
David.


