[R] dataframe: string operations on columns

Waclaw Kusnierczyk waku at idi.ntnu.no
Wed Jan 19 02:02:49 CET 2011


Assuming every row splits into exactly two values on whatever string
you pass as split, one fancy exercise in R data structures is

     dfsplit = function(df, split)
         as.data.frame(
             t(
                 structure(dim=c(2, nrow(df)),
                     unlist(
                         strsplit(split=split,
                             as.matrix(df))))))

so that if your data frame is

     df = data.frame(c('1 2', '3 4', '5 6'))

then

     dfsplit(df, ' ')
     #   V1 V2
     # 1  1  2
     # 2  3  4
     # 3  5  6

renaming the columns left as an exercise.
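
For what it is worth, the same idea can be sketched with the intermediate
steps spelled out and the column names set along the way. The helper name
dfsplit_named and its col_names argument are made up for illustration and
are not part of the function above; the sketch also assumes the strings
sit in the first column of the data frame and that lengths() is available
(R 3.2.0 or later).

```r
# Hypothetical helper (name and arguments are illustrative only):
# same approach as dfsplit, one step at a time, with explicit column names.
dfsplit_named <- function(df, split, col_names = c("V1", "V2")) {
    # split each entry of the first column into a character vector of pieces
    pieces <- strsplit(as.character(df[[1]]), split = split)
    # the approach relies on exactly two pieces per row, so check that up front
    stopifnot(all(lengths(pieces) == 2))
    # unlist() yields piece1, piece2, piece1, piece2, ...;
    # filling by row puts each original row on its own matrix row
    m <- matrix(unlist(pieces), ncol = 2, byrow = TRUE)
    out <- as.data.frame(m, stringsAsFactors = FALSE)
    names(out) <- col_names
    out
}

df <- data.frame(x = c("1 2", "3 4", "5 6"))
res <- dfsplit_named(df, " ", col_names = c("first", "second"))
# res$first is c("1", "3", "5") and res$second is c("2", "4", "6"),
# both as character vectors
```

The pieces stay character strings; wrap a column in as.numeric() if
numbers are wanted.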

vQ

On 01/18/2011 05:22 PM, Peter Ehlers wrote:
> On 2011-01-18 08:14, Ivan Calandra wrote:
>> Hi,
>>
>> I guess it's not the nicest way to do it, but it should work for you:
>>
>> #create some sample data
>> df<- data.frame(a=c("A B", "C D", "A C", "A D", "B D"),
>> stringsAsFactors=FALSE)
>> #split the column by space
>> df_split<- strsplit(df$a, split=" ")
>>
>> #place the first element into column a1 and the second into a2
>> for (i in seq_along(df_split[[1]])){
>>    df[i+1]<- unlist(lapply(df_split, FUN=function(x) x[i]))
>>    names(df)[i+1]<- paste("a",i,sep="")
>> }
>>
>> I hope people will give you more compact solutions.
>> HTH,
>> Ivan
>>
> You can replace the loop with
>
>  df <- transform(df, a1 = sapply(df_split, "[[", 1),
>                      a2 = sapply(df_split, "[[", 2))
>
> Peter Ehlers
>
>>
>>
>> On 1/18/2011 16:30, boris pezzatti wrote:
>>>
>>> Dear all,
>>> how can I perform a string operation like strsplit(x," ")  on a column
>>> of a dataframe, and put the first or the second item of the split into
>>> a new dataframe column?
>>> (so that on each row it is consistent)
>>>
>>> Thanks
>>> Boris
>


