[R] strange split behavior?

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Sep 23 08:24:39 CEST 2009


Peng Yu wrote:
> Hi,
> 
> Please see the command with a comment below. I don't find
> 'A630039F22Rik' in y. But 'A630039F22Rik' is in z. Can somebody let me
> know what the problem is?

The most obvious guess is that your factor y has a level that is not
present in the data. split() returns one component per level, so an
unused level yields an empty vector. That is perfectly normal, even
desirable in some cases.

For example (sorry about the different names: here f is the factor and
y is the data):

 > f <- factor(rep(1,4),levels=0:1)
 > y <- 1:4
 > split(y,f)
$`0`
integer(0)

$`1`
[1] 1 2 3 4

 > table(f)
f
0 1
0 4
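
If the empty groups are unwanted, something along the following lines
should work. This is only a sketch on the toy data above, not on the
original x and y; the names unused, z1 and z2 are made up for
illustration. It lists the levels that never occur, then shows two ways
of getting a split without them: the drop argument of split(), and
rebuilding the factor with factor() so that only occurring levels remain.

```r
# Toy data as in the example above: level "0" is declared but never occurs.
f <- factor(rep(1, 4), levels = 0:1)
y <- 1:4

# Levels that are declared but have no observations (illustrative name).
unused <- setdiff(levels(f), as.character(f))
print(unused)

# Option 1: let split() drop the empty groups.
z1 <- split(y, f, drop = TRUE)

# Option 2: rebuild the factor, which keeps only the levels that occur.
z2 <- split(y, factor(f))

str(z1)
str(z2)
```

Whether dropping is appropriate depends on the analysis; keeping the
empty groups can be the right thing when zero counts are meaningful.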


> Regards,
> Peng
> 
>> str(x)
>  int [1:365494] 6 7 8 14 15 18 19 21 25 29 ...
>> str(y)
>  Factor w/ 29904 levels "0610005C13Rik",..: 17261 28617 15927 15462
> 8988 23500 16577 20250 27911 13981 ...
>> z=split(x,y)
>> str(z[5529])
> List of 1
>  $ A630039F22Rik: int(0)
>> which(y=='A630039F22Rik')#it is weird
> integer(0)
>> str(z[1])
> List of 1
>  $ 0610005C13Rik: int [1:5] 592506 735015 958481 979622 1124670
>> which(y=='0610005C13Rik')
> [1] 181073 224717 292981 299543 343964
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



