[R] strange split behavior?

Peng Yu pengyu.ut at gmail.com
Wed Sep 23 14:29:30 CEST 2009


On Wed, Sep 23, 2009 at 1:24 AM, Peter Dalgaard
<p.dalgaard at biostat.ku.dk> wrote:
> Peng Yu wrote:
>>
>> Hi,
>>
>> Please see the command with a comment below. I don't find
>> 'A630039F22Rik' in y. But 'A630039F22Rik' is in z. Can somebody let me
>> know what the problem is?
>
> Most obvious guess is that your factor y has a level that is not present in
> data. That is perfectly normal, even desirable in some cases.
>
> e.g., (sorry about the different names)
>
>> f <- factor(rep(1,4),levels=0:1)
>> y <- 1:4
>> split(y,f)
> $`0`
> integer(0)
>
> $`1`
> [1] 1 2 3 4
>
>> table(f)
> f
> 0 1
> 0 4

I see. The problem is that I extracted a subset of a factor ('fdata' in
the following case) and expected the result to carry only the levels
that actually occur in the subset. It does not: the subset keeps all
of the original levels.
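
For the split() case, here is a minimal sketch of how the unused levels
show up, and of one possible way around it using the 'drop' argument of
split(). The variable names (sub, s_all, s_used) are only illustrative,
and I am assuming the default split() method is the one being dispatched.

```r
# Same toy values as in the transcript further down in this message.
data  <- c(2, 2, 3, -1, 1)
fdata <- factor(data)

# Subsetting a factor keeps every original level.
sub <- fdata[1:2]

# By default split() returns one component per level, including empty ones.
s_all <- split(1:2, sub)

# With drop = TRUE, levels that do not occur in 'sub' are omitted.
s_used <- split(1:2, sub, drop = TRUE)

names(s_all)
names(s_used)
```

This only changes what split() returns; 'sub' itself still carries all
four levels.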

Is there an operation on a factor to get a subset and keep only the
corresponding levels (see commented line below)?

> data=c(2,2,3,-1,1)
> fdata=factor(data)
> fdata[1:2]  # The levels still list all values, but I only want "2".
[1] 2 2
Levels: -1 1 2 3
> levels(fdata[1:2])
[1] "-1" "1"  "2"  "3"
>
> as.vector(levels(fdata))
[1] "-1" "1"  "2"  "3"
>
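
For reference, two approaches that seem to do what I am asking for,
sketched under the assumption that rebuilding the factor is acceptable:
calling factor() again on the subset, or passing drop = TRUE when
subsetting. The names 'a' and 'b' are only illustrative.

```r
data  <- c(2, 2, 3, -1, 1)
fdata <- factor(data)

# Option 1: re-create the factor from the subset; unused levels are discarded.
a <- factor(fdata[1:2])

# Option 2: ask the factor subsetting method to drop unused levels directly.
b <- fdata[1:2, drop = TRUE]

levels(a)
levels(b)
```

Both keep the values of the subset and only shrink the level set. Newer
R releases also provide droplevels(), which should be equivalent here,
but it may not be available in older installations.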



