[R] Why strsplit can be used with matrix but not data.frame?

David Winsemius dwinsemius at comcast.net
Thu Sep 17 03:30:45 CEST 2009


On Sep 16, 2009, at 9:22 PM, Peng Yu wrote:

> Hi,
>
> As shown in the code below, strsplit can be applied to a matrix but
> not to a data.frame. I don't understand why R is designed this way.
> Can somebody help me understand it? How can I split all the strings
> in x$y?
>
> x=data.frame(x=1:10,y=rep("abc",10))
> strsplit(x$y,'b') #Error in strsplit(x$y, "b") : non-character argument
> y=cbind(1:10,rep("abc",10))
> strsplit(y[,2],'b')

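You've been tripped up by the factor demon: data.frame() converted the
character column y to a factor, and strsplit() requires a character
vector. cbind(), by contrast, coerced everything to a character matrix,
which is why y[,2] works.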

  ?strsplit
  str(x)

'data.frame':	10 obs. of  2 variables:
  $ x: int  1 2 3 4 5 6 7 8 9 10
  $ y: Factor w/ 1 level "abc": 1 1 1 1 1 1 1 1 1 1
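
The difference between the two objects can be checked directly. This is
only a small sketch: stringsAsFactors = TRUE is passed explicitly (an
assumption added here) so that the result does not depend on the
session's default setting.

```r
# Pass stringsAsFactors = TRUE explicitly to reproduce the str() output above
x <- data.frame(x = 1:10, y = rep("abc", 10), stringsAsFactors = TRUE)
y <- cbind(1:10, rep("abc", 10))

class(x$y)     # "factor": integer codes plus a levels attribute
class(y[, 2])  # "character": cbind() coerced the whole matrix to character
```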


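There is a global option (see ?options):
stringsAsFactors:   The default setting for arguments of data.frame
and read.table.

Setting it to FALSE, or passing stringsAsFactors=FALSE to data.frame()
directly, keeps x$y as a character vector and would allow you to
"design" as you see fit.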
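
Two possible ways around the error are sketched below. The
as.character() conversion is not mentioned above; it is added here as
an alternative for a data frame that already contains a factor column.
The names parts1, parts2 and x2 are purely illustrative.

```r
# Option 1: convert the existing factor column to character on the fly
x <- data.frame(x = 1:10, y = rep("abc", 10), stringsAsFactors = TRUE)
parts1 <- strsplit(as.character(x$y), "b")

# Option 2: keep strings as character when the data frame is built
x2 <- data.frame(x = 1:10, y = rep("abc", 10), stringsAsFactors = FALSE)
parts2 <- strsplit(x2$y, "b")

identical(parts1, parts2)  # TRUE
parts2[[1]]                # "a" "c"
```

Option 1 leaves the data frame untouched, while option 2 avoids creating
the factor in the first place; which one fits better depends on whether
the column is needed as a factor elsewhere.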

-- 
David Winsemius, MD
Heritage Laboratories
West Hartford, CT



