[R] Difference in "%in%" function to boundary setting via <>

Benjamin Otto b.otto at uke.uni-hamburg.de
Thu Feb 8 11:29:26 CET 2007


Hi,

One thing is currently irritating me quite a bit: the %in% function and the
smaller/bigger than signs (<>) behave differently. Here are two examples to
demonstrate what I mean:

Example1:
> c(1,1,2,2,3,4,4,6,7) %in% c(1,2,3)
[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE

Right, that is what I expect.

Example2:
> ps <- seq(-0.502,0.378,by=0.001)
> ps[494]
[1] -0.009

> class(ps[494])
[1] "numeric"
> class(-0.009)
[1] "numeric"
> class(ps[494])
[1] "numeric"

> ps[494] == -0.009
[1] FALSE
> ps[494] %in% -0.009
[1] FALSE
> ps[494] == c(-0.009)
[1] FALSE
> ps[494] %in% c(-0.009)
[1] FALSE
> ps[494] <= -0.008
[1] TRUE
> ps[494] >= -0.010
[1] TRUE
> -0.009 == -0.009
[1] TRUE

BUT: 
> ps[249]
[1] -0.254
> class(ps[249])
[1] "numeric"
> ps[249] %in% -0.254
[1] TRUE
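
One way to look closer at the two values in Example2 is to print them with more digits than the default and to compare them with a tolerance instead of exactly. This is only a sketch: the tolerance of 1e-9 is an assumption, picked because the values have three decimals, and all.equal() uses its own default tolerance.

```r
# Rebuild the sequence from Example2.
ps <- seq(-0.502, 0.378, by = 0.001)

# Print both numbers with 17 significant digits instead of the default 7.
print(ps[494], digits = 17)
print(-0.009, digits = 17)

# The raw difference between the computed and the typed value.
ps[494] - (-0.009)

# Tolerance-based comparisons (the 1e-9 threshold is an assumed value).
isTRUE(all.equal(ps[494], -0.009))
abs(ps[494] - (-0.009)) < 1e-9
```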

OK! Can somebody explain to me what is happening here? Honestly, I don't
understand where the difference comes from, but it's critical! Obviously, when
I have a set of numeric values (ALL have three digits) and two boundary values
lb/up, a lower and an upper boundary, I could (from what I thought until now)
choose between:

Version1:
> small.set <- set[set %in% seq(lb,up,by=0.001)]

Version2:
> small.set <- set[set >= lb & set <= up]

Unfortunately, with the data I used I got around 8000 values from my set with
version1 but about 24000 with version2. Is there some main difference I
didn't take into account, or is my system just behaving irrationally (that's
what I think if you look at Example2)?
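
The comparison of the two versions can be reproduced on made-up data with a sketch like the following. The values of set, lb and up here are hypothetical stand-ins (the real data are not shown), and the third variant with the helper to_int() is an assumption added only for comparison: it matches on whole thousandths instead of on decimal fractions.

```r
# Hypothetical data: 1000 values rounded to three decimals.
set.seed(1)
set <- round(runif(1000, min = -0.5, max = 0.4), 3)
lb <- -0.1   # assumed lower boundary
up <- 0.1    # assumed upper boundary

# Version1: exact matching against a generated sequence.
v1 <- set[set %in% seq(lb, up, by = 0.001)]

# Version2: range comparison.
v2 <- set[set >= lb & set <= up]

# Version3 (assumption, not part of the original versions):
# convert to whole thousandths and match on integers.
to_int <- function(x) as.integer(round(x * 1000))
v3 <- set[to_int(set) %in% seq(to_int(lb), to_int(up))]

# Compare how many values each version keeps.
c(version1 = length(v1), version2 = length(v2), version3 = length(v3))
```

If the three counts differ, listing the values that version2 keeps and version1 drops, for example with setdiff(v2, v1), shows which numbers are affected.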

I checked the behaviour under R-2.4.1 (Windows) and under 2.2.1 (Linux). The
result was the same.

Sincere regards

Benjamin Otto 

-- 
Benjamin Otto
Universitaetsklinikum Eppendorf Hamburg
Institut fuer Klinische Chemie
Martinistrasse 52
20246 Hamburg


