[R] That dreaded floating point trap

Thu Mar 31 15:39:16 CEST 2011

On 11-03-31 7:24 AM, Alexander Engelhardt wrote:
> Hi,
> I had a piece of code which looped over a decimal vector like this:
>
>
> for( i in where ){
>     thisdata<- subset(herde, herde$mlr>= i)
>     # do stuff with thisdata..
> }
>
> 'where' is a vector like seq(-1, 1, by=0.1)

The solution to this problem is to take steps by representable numbers, 
not by numbers like 0.1 that can't be represented exactly.  For example, 
seq(-1, 1, by=0.25) has exact entries, because fractions with small 
powers of 2 in the denominator are all exactly representable.  ("Small" 
depends on the numerator, but for fractions between 0 and 1 it's about 
52, so not really so small.)
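
A minimal sketch of the point above, using the two grids from the original question. The variable names (quarters, a, b, where_full, where_short) are made up for illustration. The last part, stepping over integers and dividing once, is an assumed alternative that is not from this reply; it applies the same idea when steps of 0.1 are really wanted.

```r
# Steps of 0.25 are exactly representable, so every grid point is exact.
quarters <- seq(-1, 1, by = 0.25)
all(quarters == c(-1, -0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1))

# With by = 0.1 the entry that is nominally 0.4 is built from a different
# start value in each grid, so the two results need not be identical.
a <- seq(-1,   1, by = 0.1)[15]   # 15th entry, nominally 0.4
b <- seq(-0.8, 1, by = 0.1)[13]   # 13th entry, nominally 0.4
print(c(a, b), digits = 17)       # inspect the stored values
a == b

# Assumed alternative (not from the post): loop over integers, scale once.
# Each entry is then a single division k / 10, independent of the start.
where_full  <- (-10:10) / 10
where_short <- (-8:10) / 10
where_full[15] == where_short[13]   # both are computed as 4 / 10
```

With the integer-based grid, the threshold used for i = 0.4 no longer depends on where the sequence starts, which is what the round(i, digits=1) workaround was compensating for.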

Duncan Murdoch

>
> My problem was: 'nrow(thisdata)' in the loop iteration for i = 0.4 was
> different if 'where' was seq(-1, 1, by=0.1) than when 'where' was
> seq(-0.8, 1, by=0.1).
> It went away after I changed the first line to:
>
>     thisdata<- subset(herde, herde$mlr>= round(i, digits=1))
>
> This is that "floating point trap" the R inferno pdf talked about,
> right? That file talked about the problem, but didn't offer a solution.
>
> Similar things happened when I created a table() from a vector with
> values in seq(-1, 1, by=0.1)
>
> Do I really have to round every float at every occurrence from now on, or
> is there another solution? I only found all.equal() and identical(), but
> I want to subset for observations with a value /greater/ than something.
>
> Thanks in advance,
>    Alex