[R] struggling with "split" function

Dimitri Liakhovitski ld7631 at gmail.com
Mon Sep 7 03:49:21 CEST 2009


Found a mistake - it was mine!
Thanks a lot for your help!
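
The thread does not say what the mistake was. The call printed in the error, `[.default`(x$A, na.ind, -1), is what R evaluates when a single column is indexed with two subscripts instead of the data frame. One plausible cause, and this is only an assumption, is typing x$A[na.ind, -1] where x[na.ind, -1] was meant. A minimal sketch on a made-up three-row data frame:

```r
# Tiny made-up data frame; stringsAsFactors = FALSE only keeps A as a
# plain character vector so the example behaves the same in any R version.
x <- data.frame(A = c(NA, "split", NA),
                B = c("Name1", NA, "Name2"),
                C = c(1, NA, 2),
                stringsAsFactors = FALSE)
na.ind <- is.na(x$A)

# Two subscripts on a vector (x$A has no dimensions): this call fails.
res <- tryCatch(x$A[na.ind, -1],
                error = function(e) conditionMessage(e))
print(res)

# Two subscripts on the data frame itself: rows where A is NA,
# every column except the first.
ok <- x[na.ind, -1]
print(ok)
```

If A is stored as a factor, the same failure is reported from `[.default`, which matches the message quoted below.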

On Sun, Sep 6, 2009 at 8:43 AM, Dimitri Liakhovitski<ld7631 at gmail.com> wrote:
> Thanks a lot, Dimitris.
> It totally works on my example data frame.
> I know, it's probably hard to address, but when I try to apply it to
> the real huge data frame I have, after the last line I get:
> Error in `[.default`(x$A, na.ind, -1) :  incorrect number of dimensions.
> I know it's impossible to answer this question without seeing the
> data, but still: what do you think might be wrong?
>
> Do you think it could be because my first column contains something
> other than "split"? No, I've just run table() on A and the result is:
> split <NA>
> 204 6356
>
> I also checked the first dimension of x and length(na.ind) - they
> are the same: 6560.
>
> No idea where the error might lie...
>
>
> Thanks a lot!
> Dimitri
>
> On Sun, Sep 6, 2009 at 5:43 AM, Dimitris
> Rizopoulos<d.rizopoulos at erasmusmc.nl> wrote:
>> one way is the following:
>>
>> ind <- rle(is.na(x$A))
>> ind <- rep(seq_along(ind$lengths), ind$lengths)
>> na.ind <- is.na(x$A)
>> split(x[na.ind, -1], ind[na.ind])
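
A step-by-step sketch of how the four lines above work, using a shortened stand-in for the example data (the values are made up for illustration, and the intermediate name "runs" is introduced here only for readability): rle() finds the consecutive runs of NA and non-NA values in A, every row is labelled with the id of its run, and split() then groups the NA rows by that id.

```r
# Shortened stand-in for the data frame in the question (made-up values).
x <- data.frame(
  A = c(NA, NA, "split", NA, NA, NA, "split", NA),
  B = c("Name1", "text1", NA, "Name2", "text1", "text2", NA, "Name3"),
  C = c(NA, 1, NA, NA, 4, 5, NA, NA),
  stringsAsFactors = FALSE
)

na.ind <- is.na(x$A)     # TRUE for the rows that belong to a block
runs   <- rle(na.ind)    # lengths of consecutive TRUE / FALSE runs
ind    <- rep(seq_along(runs$lengths), runs$lengths)  # run id for each row

# Keep only the NA rows, drop column A, and group by run id.
pieces <- split(x[na.ind, -1], ind[na.ind])
print(pieces)
```

The list names are the run ids of the kept rows, so the "split" rows leave gaps in the numbering.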
>>
>>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>>
>> Dimitri Liakhovitski wrote:
>>>
>>> I am very sorry for such a simple question, but I am struggling with
>>> "split".
>>> I have the following data frame:
>>>
>>> x<-data.frame(A=c(NA,NA,NA,NA,"split",NA,NA,NA,NA,"split",NA,NA,NA,NA,"split",NA,NA,NA,NA),
>>>
>>> B=c("Name1","text1","text2","text3",NA,"Name2","text1","text2","text3",NA,"Name3","text1","text2","text3",NA,"Name4","text1","text2","text3"),
>>>
>>> C=c(NA,1,NA,3,NA,NA,4,5,6,NA,NA,7,8,9,NA,NA,3,3,3),D=c(NA,1,1,2,NA,NA,5,6,NA,NA,NA,9,8,7,NA,NA,2,2,2),
>>> E=c(NA,3,2,1,NA,NA,6,5,4,NA,NA,7,7,8,NA,NA,1,NA,1))
>>> print(x)
>>>
>>> All I want to do is to split x, i.e., to create a list of data frames
>>> that are currently separated by the word "split" in column A. In this
>>> example, it would be 4 data frames, the first of them being:
>>>    A     B  C  D  E
>>>   NA Name1 NA NA NA
>>>   NA text1  1  1  3
>>>   NA text2 NA  1  2
>>>   NA text3  3  2  1
>>>
>>> etc.
>>>
>>> I tried:
>>> split(x, x$A)
>>> split(x,x$A == 'split')
>>> split(x,!is.na(x$A))
>>>
>>> But nothing produces what I need.
>>> Thanks a lot for any hint!
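
A short sketch of why the three attempts above do not give the desired list, on a made-up four-row data frame: split() drops rows whose grouping value is NA, and a TRUE/FALSE grouping can only ever yield two groups. The cumsum() line at the end is an alternative grouping shown for comparison; it is not taken from this thread.

```r
# Made-up four-row data frame with a single "split" marker row.
x <- data.frame(A = c(NA, "split", NA, NA),
                B = c("Name1", NA, "Name2", "text1"),
                C = c(1, NA, 2, 3),
                stringsAsFactors = FALSE)

a1 <- split(x, x$A)             # NA rows are dropped: only the marker row is left
a2 <- split(x, x$A == "split")  # the comparison is NA for NA rows: same problem
a3 <- split(x, !is.na(x$A))     # two groups: all NA rows lumped together, markers

print(names(a1))
print(names(a2))
print(names(a3))

# Alternative (not from the thread): count the markers seen so far and
# use that running count as the group id for the NA rows.
na.ind <- is.na(x$A)
a4 <- split(x[na.ind, ], cumsum(!na.ind)[na.ind])
print(a4)
```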
>>>
>>
>> --
>> Dimitris Rizopoulos
>> Assistant Professor
>> Department of Biostatistics
>> Erasmus University Medical Center
>>
>> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
>> Tel: +31/(0)10/7043478
>> Fax: +31/(0)10/7043014
>>
>
>
>
> --
> Dimitri Liakhovitski
> Ninah.com
> Dimitri.Liakhovitski at ninah.com
>



-- 
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com



