[R] cut2 not binning interval endpoints correctly

William Dunlap wdunlap at tibco.com
Wed Nov 27 18:51:39 CET 2013


You can look at the source code of Hmisc::cut2() to see what is going on -- it
does a lot more than call cut() with different default arguments.  Another
approach to debugging this is to use trace() to see what cut2() passes down
to the default cut method:

> trace(cut.default, quote(cat("   x=", deparse(x), "\n   breaks=", deparse(breaks), "\n")))
Tracing function "cut.default" in package "base"
[1] "cut.default"
> z <- cut2(c(0.30800), seq(0,1,0.001)[306:315], oneval=FALSE)
Tracing cut.default(x, k2) on entry 
   x= 0.308 
   breaks= c(0.3045, 0.3055, 0.3065, 0.3075, 0.3085, 0.3095, 0.3105, 0.3115,  0.3125, 0.314) 
> z
[1] [0.308,0.309)
9 Levels: [0.305,0.306) [0.306,0.307) [0.307,0.308) ... [0.313,0.314]

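I.e., the breaks that cut2() hands to cut.default() are not the cut points
that were supplied: all but the last have been shifted down by 0.0005, so
0.308 falls well inside [0.3075, 0.3085) rather than on a boundary.  This has
little to do with floating point errors in cut().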
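
As a rough check, and assuming the breaks printed by trace() above are exactly
what cut.default() received, something like the following shows how far 0.308
sits from the nearest break.  It uses only base R (findInterval()), not cut2()
itself, so it is a sketch of the arithmetic rather than of what Hmisc does
internally:

```r
# Breaks copied from the trace() output above (assumed to be what
# cut2() passed down to cut.default()).
breaks <- c(0.3045, 0.3055, 0.3065, 0.3075, 0.3085,
            0.3095, 0.3105, 0.3115, 0.3125, 0.314)
x <- 0.308

# Index i such that breaks[i] <= x < breaks[i + 1].
i <- findInterval(x, breaks)

# Distance from x to the nearest break.
gap <- min(abs(x - breaks))

cat("interval index:", i, "\n")
cat("interval bounds:", breaks[i], breaks[i + 1], "\n")
cat("distance to nearest break:", gap, "\n")
cat("machine epsilon:", .Machine$double.eps, "\n")
```

The distance to the nearest break is many orders of magnitude larger than
.Machine$double.eps, so representation error cannot move 0.308 across a
break here.  When you are done, untrace(cut.default) removes the tracing.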

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of S Ellison
> Sent: Wednesday, November 27, 2013 9:12 AM
> To: r-help at r-project.org
> Subject: Re: [R] cut2 not binning interval endpoints correctly
> 
> 
> 
> > -----Original Message-----
> >jim holtman <jholtman at gmail.com>
> > You need to look at the full accuracy of the number representation:
> Um... I think I did. But I'm not sure you did....
> print(..., digits=20) has used different numbers of digits for your two print()s, probably
> because print() decided it needed more digits for the multi-valued vector. The internal
> representations were the same. Try
> 
> print(seq(0, 0.310, 0.001)[309], digits = 20)
> [1] 0.307999999999999996
> 
> print(seq(0, 0.310, 0.001)[309], digits = 22)
> [1] 0.3079999999999999960032
> 
> > print(0.308, digits = 22)
> [1] 0.3079999999999999960032
> 
> 0.308 does match the cut boundary 'exactly' in this case (which is why the usually unwise
> '==' returned TRUE), though neither is exactly 0.308.
> 
> Nonetheless, I understand that FAQ 7.31 is a good candidate for other 'unexpected' cut2
> results. However, that isn't the whole story. It doesn't explain the corresponding cut(,
> right=FALSE) result, which should give the same answer as cut2 if finite representation
> were the sole cause. So there's summat else going on.
> 
> 
> Steve E


