[R] levels of comma separated data

Stefan stefan at inizio.se
Fri May 25 10:46:15 CEST 2012


analyst41 <at> hotmail.com <analyst41 <at> hotmail.com> writes:

> 
> I have a data set that has some comma separated strings in each row.
> I'd like to create a vector consisting of all distinct strings that
> occur.  The number of strings in each row may vary.
> 
> Thanks for any help.
> 
> 
#
#
# Some data:
d <- data.frame(id = 1:5, 
  text = c('one,two',
    'two,three,three,four',
    'one,three,three,five',
    'five,five,five,five',
    'one,two,three'),
  stringsAsFactors = FALSE
)
#
# 
# A function. I'm not a black belt at this, so there 
# is probably a more efficient way of writing this.
fcn <- function(x){
  a <- strsplit(x, ',') # Split the string by comma
  unique(a[[1]]) # Uniquify the vector
}
#
#
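# Use the function with sapply. This gives the distinct strings
# within each row.
sapply(d[,2], fcn)
#
# To get all distinct strings across rows, flatten the per-row
# results and uniquify once more (lapply always returns a list).
unique(unlist(lapply(d[,2], fcn)))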
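#
#
# A possible shortcut, as a sketch only: strsplit() is vectorised
# over its input, so the whole column can be split in one call,
# flattened with unlist(), and reduced with unique(). This goes
# straight to the single vector of distinct strings asked for.
# The names parts and all_strings are made up for illustration, and
# fixed = TRUE assumes the separator is always a literal comma
# with no surrounding spaces.

```r
# Self-contained copy of the example data.
d <- data.frame(id = 1:5,
  text = c('one,two',
    'two,three,three,four',
    'one,three,three,five',
    'five,five,five,five',
    'one,two,three'),
  stringsAsFactors = FALSE
)
# Split every row at once; the result is a list with one
# character vector per row.
parts <- strsplit(d$text, ',', fixed = TRUE)
# Flatten the list into one vector and keep each string once,
# in order of first appearance.
all_strings <- unique(unlist(parts))
all_strings
# [1] "one"   "two"   "three" "four"  "five"
```

# If the strings may carry stray spaces around the commas, they
# would need trimming before the call to unique().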


