[R] 2 small problems: integer division and the nature of NA

Peter Dalgaard p.dalgaard at biostat.ku.dk
Fri Feb 4 17:42:03 CET 2005


Denis Chabot <chabotd at globetrotter.net> writes:

> Hi,
> 
> I'm wondering why
> 
> 48 %/% 2 gives 24
> but
> 4.8 %/% 0.2 gives 23...
> I'm not trying to round up here, but to find out how many times
> something fits into something else, and the answer should have been
> the same for both examples, no?

Well, you can't trust floating point numbers to give you an exact
result:

> 4.8 / 0.2 - 24
[1] -3.552714e-15

and even

> (48/10) / (2/10) - 24
[1] -3.552714e-15

the basic issue being that tenths are not exactly representable in
binary floating point, so the computed quotient lands just below 24
and %/% rounds it down to 23. I think very few people even expected
you to use integer division on non-integers, but I note that the
claim on the help page actually holds:

> 0.2 * 4.8 %/% 0.2  + 4.8 %% 0.2 == 4.8
[1] TRUE
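
If what you want is "how many times does y fit into x" for decimal
inputs, one possible workaround is sketched below. This is only an
illustration: fits() is a made-up helper, not an R function, and the
tolerance of 1e-9 is an arbitrary assumption that you would have to
tune to the scale of your own data.

```r
x <- 4.8
y <- 0.2

# The raw quotient is a hair below 24, so flooring it gives 23.
x / y - 24
floor(x / y)                      # 23

# Hypothetical helper (not part of R): nudge the quotient up by a small
# tolerance before flooring. The default tol = 1e-9 is an assumption.
fits <- function(x, y, tol = 1e-9) floor(x / y + tol)

fits(4.8, 0.2)                    # 24
fits(48, 2)                       # 24

# Alternative: convert to whole units first (here tenths), so that the
# integer division is done on values that are exactly representable.
round(x * 10) %/% round(y * 10)   # 24
```

The tolerance cuts both ways: a quotient that really is just below a
whole number gets pushed up as well, so the rescaling variant is the
safer one when the inputs have a known number of decimals.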

 
> On a different topic, I like the behavior of NAs better in R than in
> SAS (at least they are not considered the smallest value for a
> variable), but at the same time I am surprised that the sum of NAs is
> 0 instead of NA.
> 
> The sum of a vector having at least one NA but also valid data gives
> NA if we do not specify na.rm=T. But with na.rm=T, we are telling sum
> to give the sum of valid data, ignoring NAs that do not tell us
> anything about the value of a variable. I found out while getting the
> sum of small subsets of my data (such as when subsetting by several
> variables), sometimes a "cell" only contained NAs for my response
> variable. I would have expected the sum to be NA in such cases, as I
> do not have a single data point telling me the value of my response
> here. But R tells me the sum was zero in that cell! Was this behavior
> considered "desirable" when sum was built? If not, any hope it will be
> fixed?

Yes it was, and no, there isn't. With na.rm=TRUE, a cell holding only
NAs leaves nothing to add up, and in math, the sum over an empty index
set is zero, which has some nice consistency properties (the sum over
a disjoint union of sets is the sum of the sums over each set, for
instance).
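
A small sketch of the above, in case it helps. The built-in calls
behave as shown; sum_or_na() is a hypothetical helper, not something
R provides, for the case where you would rather get NA back from a
cell that has no valid data at all.

```r
all_na <- c(NA_real_, NA_real_)

sum(all_na)                       # NA
sum(all_na, na.rm = TRUE)         # 0, nothing is left to add up
sum(numeric(0))                   # 0, the empty sum

# The consistency property: splitting a vector into disjoint pieces
# (one of them empty) does not change the total.
a <- c(1, 2, 3)
b <- numeric(0)
sum(c(a, b)) == sum(a) + sum(b)   # TRUE

# Hypothetical helper (an assumption, not an R function): return NA
# when there is no non-missing value, otherwise sum the valid data.
# Note that a zero-length input also yields NA here.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

sum_or_na(all_na)                 # NA
sum_or_na(c(1, NA, 2))            # 3
```

Something like tapply(response, group, sum_or_na), with your own
variable names in place of response and group, would then mark the
all-NA cells as NA rather than 0.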

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



