[R] cut with floating point, a bug?

Duncan Murdoch murdoch at stats.uwo.ca
Fri Jun 19 09:09:49 CEST 2009


Shawn Rutledge wrote:
> With floating point numbers I'm seeing 'cut' putting values in the wrong
> bands. An example below places 0.3 in (0.3,0.6] i.e. 0.3 > 0.3.
>
>> x = 1:5*.1
>> x
> [1] 0.1 0.2 0.3 0.4 0.5
>> cut(x, br=c(0,.3,.6))
> [1] (0,0.3]   (0,0.3]   (0.3,0.6] (0.3,0.6] (0.3,0.6]
> Levels: (0,0.3] (0.3,0.6]
>
> I'm sure this is probably the same issue documented in the FAQ (7.31 Why
> doesn't R think these numbers are equal?)
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>
> [1] Is there a way to make cut work correctly (a code fix)?
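
It is working correctly.  The third element of x is computed as 3 * 0.1,
which in double precision is slightly bigger than the value stored for
the literal 0.3, so it belongs in (0.3,0.6].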
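
A quick check makes the difference visible. This is only a sketch using base R; printing with 17 significant digits is my choice here, picked because it is enough to tell two distinct doubles apart.

```r
x <- 1:5 * 0.1

# The default print shows 7 significant digits, which hides the difference.
# Formatting with 17 significant digits exposes it.
third_element <- sprintf("%.17g", x[3])
literal_value <- sprintf("%.17g", 0.3)

third_element          # "0.30000000000000004"
x[3] == 0.3            # FALSE
x[3] > 0.3             # TRUE, which is why cut() puts it in (0.3,0.6]
```
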
> [2] Is there a workaround for using the current cut?

You could round all values, both the data and the breaks, to the same
number of decimal places before calling cut.
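
One way the rounding could look is sketched below. The choice of 10 decimal places and the variable names are assumptions for illustration; pick a precision that suits the data.

```r
x <- 1:5 * 0.1
breaks <- c(0, 0.3, 0.6)

# Assumed precision: far coarser than the representation error (about 1e-16),
# far finer than the spacing of the data.
digits <- 10

# Round the data and the breaks the same way so equal decimal values
# end up as the same double.
binned <- cut(round(x, digits), breaks = round(breaks, digits))
counts <- as.vector(table(binned))
counts
```

With this rounding the third element compares equal to the 0.3 break and is placed in (0,0.3].
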
> [3] Why does 'hist' work correctly on the same data?
>   

See ?hist.  It applies a numerical tolerance when working on the edges 
of bins.
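
To get a similar effect from cut, the breaks can be nudged by a small tolerance. The sketch below is not a copy of what hist does internally; the tolerance value and the names tol and fuzzy_breaks are hypothetical, chosen only to illustrate the idea.

```r
x <- 1:5 * 0.1
breaks <- c(0, 0.3, 0.6)

# Hypothetical tolerance for illustration; not necessarily the value hist() uses.
tol <- 1e-7 * min(diff(breaks))

# Move every upper edge up by tol so a value a hair above an edge
# still lands in the lower bin. The lowest break is left alone.
fuzzy_breaks <- breaks + c(0, tol, tol)
fuzzy_counts <- as.vector(table(cut(x, breaks = fuzzy_breaks)))

# For comparison, the counts hist() reports on the unmodified breaks.
hist_counts <- hist(x, breaks = breaks, plot = FALSE)$counts
```

Note that the level labels produced by cut are generated from the shifted breaks, so pass explicit labels if the printed intervals matter.
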

Duncan Murdoch

>> table(cut(x, br=c(0,.3,.6)))
>   (0,0.3] (0.3,0.6]
>         2         3
>
>> hist(x, br=c(0,.3,.6), plot=F)$counts
> [1] 3 2
>
>> sessionInfo()
> R version 2.9.0 (2009-04-17) 
> i386-pc-mingw32 
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



