[R] Why strsplit can be used with matrix but not data.frame?

Peng Yu pengyu.ut at gmail.com
Thu Sep 17 03:41:14 CEST 2009


On Wed, Sep 16, 2009 at 8:30 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Sep 16, 2009, at 9:22 PM, Peng Yu wrote:
>
>> Hi,
>>
>> As shown in the code below, strsplit can be applied to a matrix but not
>> to a data.frame. I don't understand why R is designed this way. Can
>> somebody help me understand it? How can I split all the strings in x$y?
>>
>> x=data.frame(x=1:10,y=rep("abc",10))
>> strsplit(x$y,'b') #Error in strsplit(x$y, "b") : non-character argument
>> y=cbind(1:10,rep("abc",10))
>> strsplit(y[,2],'b')
>
> You've been tripped up by the factor demon.
>
>  ?strsplit
>  str(x)
>
> 'data.frame':   10 obs. of  2 variables:
>  $ x: int  1 2 3 4 5 6 7 8 9 10
>  $ y: Factor w/ 1 level "abc": 1 1 1 1 1 1 1 1 1 1
>
>
> There is an option:
> stringsAsFactors:   The default setting for arguments of data.frame and
> read.table.
>
> Which if changed to FALSE would allow you to "design" as you see fit.
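
A minimal sketch of that suggestion, assuming the data.frame can simply be
rebuilt: passing stringsAsFactors = FALSE to data.frame() keeps y as a
character vector, so strsplit() accepts it. The name 'pieces' is only
illustrative.

```r
# Build the data.frame with y kept as character rather than factor.
x <- data.frame(x = 1:10, y = rep("abc", 10), stringsAsFactors = FALSE)

# y is now a character vector, so strsplit() works on it directly.
str(x$y)
pieces <- strsplit(x$y, "b")

# Each element of the result holds the parts on either side of "b".
pieces[[1]]
```

In R 4.0.0 and later the default is already FALSE, so the explicit argument
only matters on older versions.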

I see that I can specify FALSE for stringsAsFactors when I create a
data.frame. But if I already have a data.frame, how can I change its
'stringsAsFactors' setting?

    data.frame(..., row.names = NULL, check.rows = FALSE,
               check.names = TRUE,
               stringsAsFactors = default.stringsAsFactors())
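
For an existing data.frame there is, as far as I know, no per-object
'stringsAsFactors' setting to flip; the argument only affects how columns
are converted at creation time. One possible approach, sketched below under
that assumption, is to convert the factor columns with as.character(). The
names 'parts', 'x2', 'x3' and 'is_fac' are only illustrative.

```r
# Reproduce the factor column from the example; stringsAsFactors = TRUE is
# given explicitly because the default became FALSE in R 4.0.0.
x <- data.frame(x = 1:10, y = rep("abc", 10), stringsAsFactors = TRUE)

# Option 1: convert on the fly and leave the data.frame unchanged.
parts <- strsplit(as.character(x$y), "b")

# Option 2: replace a single factor column with its character version.
x2 <- x
x2$y <- as.character(x2$y)

# Option 3: convert every factor column in one step.
x3 <- x
is_fac <- vapply(x3, is.factor, logical(1))
x3[is_fac] <- lapply(x3[is_fac], as.character)

str(x3)
```

Option 1 is enough if the strings are needed only once; options 2 and 3
change the stored column type, so later calls such as strsplit(x3$y, 'b')
work without further conversion.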

Regards,
Peng



